Machine Translation (MT) - the 80% Solution?
In 2000, LISA Director Michael Anobile and then-Newsletter Editor Deborah Fry spoke with Tom Lueck, CEO of veteran machine translation company Logos, about machine translation, the Internet and the future of the language technology market. We are running this article from the 2000 Archives to allow readers to decide what progress, and how much, has been made during the last three years in the field of MT.
The language industry is expanding rapidly, but its growth seems out of sync with language technology’s ability to provide meaningful support. Will this change over the next five years?
There will definitely be a proliferation of tools and technology into this business, driven by the sheer size of the market and the demand for translation and localization. The current situation is still characterized not by suites of highly integrated tools but by “islands of translation” that somehow communicate, but don’t really work together well.
To give an analogy, we are in a similar situation to that of robotics or manufacturing before computer-integrated manufacturing became the paradigm. There is no highly integrated suite of tools that would support a truly intelligent process. And if you don’t have a fully integrated process, you can’t really take full advantage of what machine translation (MT) could bring to the party.
The question is, therefore, how to create an integrated process composed of different tools that today are supplied by individual companies. This implies the need for strategic alliances, and possibly mergers or pooling of interests. In my view, critical mass is still lacking in a lot of cases.
The problem still remains that, as it stands, MT doesn’t work. What do customers have to do to change this?
In fact, MT does work, but only within an appropriate process! Therefore, customers have to change the way they perceive their language or translation needs - in other words, how and where they apply MT. You have to pick your application properly: where MT works, for instance, is in high-volume applications with structured texts that are well written and possibly preedited. If you have a document that MT can’t handle well, you get what’s commonly referred to as gisting.
This is what many MT suppliers are now offering for, say, online news.
Yes, and for some applications this may be extremely attractive - especially when the consumer does not have to pay for the service. However, the minute you put a price tag on it, you will see that people will become very sensitive to quality, and you will need a very robust system such as Logos or Systran. Even so, you are still going to get somewhat unpredictable results if you just use MT alone. You probably have to use an integrated system that combines MT with translation memory (TM). Kevin Cavanaugh from Lotus used the term “TMT” to describe this four years ago in Boston.
So why isn’t MT being pursued more aggressively by clients with suitable applications?
First of all, the problem might be recognizing a suitable application when you have it. Even then, the technology is inherently complicated: it is not “one size fits all.” Furthermore, if a company views this application as non-strategic, the decision may be to outsource it, thereby losing control over the decision as to whether or not to use MT. However, maybe over time they will develop a feeling for whether to take a solution in-house, as their volume, timeliness and cost issues dictate. This is why we offer our clients both full-service MT and in-house MT solutions. I do not want to force MT onto anybody because in a lot of cases I know the customer isn’t ready to bring it in-house.
What does this mean for suppliers?
It means that today’s MT developers need to expand their vision to embrace fully integrated solutions. Ideally, this means controlling your own destiny by participating in the entire process from authoring to the finished translation. However, we have obtained significant performance improvements over conventional methods even where documents were not optimized for MT. This was due not only to the machine but also to a robust, integrated process.
This comes back to your point about the need for integration.
Absolutely. You really have to take a top-down view, and one that goes from A to Z. In other words, what we really need is an authoring environment - but one that is sensitive to machine translation, one which knows how the MT system reacts to things I do to my document, while I am writing it. Of course, this is not trivial, and may require alliances with suppliers that specialize in such authoring systems. For pre-existing text you could use what we call a “translatability index” that essentially tells you ahead of time whether a sentence can be parsed by the system. This creates the opportunity to improve the document’s “machine translatability.” However, the market and document content are both so dynamic that nobody really has the time to focus on this issue. The reality is that document generation and translation are not properly connected.
So how do you sell MT?
Selling MT is a misnomer; what MT suppliers must do is identify those solutions or applications that can’t be done without MT. They then need to devise appropriate business relationships that put the MT technology in the hands of those who need it. In other words, we are not selling a product, we’re partnering with tools suppliers, translation professionals and clients to create solutions. This integration and partnering process is inherently slow and time consuming.
Isn’t the real problem the fact that language isn’t recognized as strategic? If it were, companies would be prepared to invest.
Yes, this is indeed the problem! However, in today’s dynamic environment, companies are being forced to realize how strategic language really is. The speed of the Internet, the connectivity of the Web and the globalization of business and industry are all contributing to language awareness. Coincidentally, this creates exactly the kind of environment conducive to the use of MT - massive amounts of translation required in minuscule timeframes. Also, if companies knew how much they are already spending on translation, it might finally get upper management’s attention as a strategic issue.
Our members such as IBM, Oracle and Microsoft claim to know what they are spending.
Absolutely, but as truly global players they have already gone through the learning curve of how best to deal with multilingual content and communication. As a by-product of having a process that deals with translation, you get the ability to recognize and control costs. For other emerging global players, the need for multilingual solutions has just appeared on the radar screen of top management.
Mergers and acquisitions will also drive this process. For example, multinationals like DaimlerChrysler and Deutsche Bank/Bankers Trust have a huge communications problem - and one that has nothing to do with localization, by the way. In this example, you need to enable people to communicate across their corporation in at least two languages, and MT is the optimal solution. For instance, our customer Osram in Germany has been using the Logos system as an integrated communications enabling device for six years now. They send e-mails to a Logos server that resides on their network, and the system does the translation automatically. Optionally, some postediting can be done before the e-mail is dispatched.
So what sort of technology do you need to really take advantage of the Web?
If you want to use the Web to communicate with a global audience, you need a fully integrated translation solution that is online. Integration in this context means a workflow manager that orchestrates terminology management, translation memory and machine translation. This implies that vendors who want to deliver such solutions may have to turn themselves into application service providers.
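The orchestration described here can be sketched in a few lines. This is a minimal illustration, not how Logos or any other vendor implemented it: the function names, the 0.85 match threshold and the `mt_engine` callable are all assumptions made for the example, and the fuzzy match uses Python's standard `difflib` rather than a real TM engine.

```python
from difflib import SequenceMatcher


def best_tm_match(segment, memory):
    """Return (score, target) for the closest stored source sentence."""
    best = (0.0, None)
    for source, target in memory.items():
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if score > best[0]:
            best = (score, target)
    return best


def missing_terms(segment, translation, glossary):
    """List glossary targets that should appear in the translation but do not."""
    return [
        target
        for source, target in glossary.items()
        if source.lower() in segment.lower()
        and target.lower() not in translation.lower()
    ]


def translate_segment(segment, memory, glossary, mt_engine, threshold=0.85):
    """Reuse a TM match when it is close enough, otherwise call the MT engine.

    mt_engine is a hypothetical callable (str -> str) standing in for
    whatever MT system is available; threshold is an assumed cut-off.
    """
    score, target = best_tm_match(segment, memory)
    if target is not None and score >= threshold:
        translation, origin = target, "tm"
    else:
        translation, origin = mt_engine(segment), "mt"
    return {
        "translation": translation,
        "origin": origin,
        # Terms a posteditor should check against the terminology database.
        "review_terms": missing_terms(segment, translation, glossary),
    }
```

Each segment is answered from translation memory where possible, sent to MT otherwise, and checked against the terminology list in both cases, so the posteditor knows where the output came from and which terms to verify.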
How far off are we from this scenario?
I think it has already started. For instance, we have always cooperated with TM manufacturers, and we were the first to offer an integrated solution involving translation memory. We started with XL8 in the old days, and we have long offered integration with Star and Trados.
For Web applications you need a server-based, workflow-driven corporate TM system with a large legacy translation database linked to a powerful MT system. Ideally, you can also get the author of the document to adhere to certain standards that make it easy for the system to do the translation.
You’re right - if we had perfect, completely fault-tolerant MT systems, then of course we wouldn’t have to worry about writing styles or “controlled” languages. However, perfect MT doesn’t yet exist. Therefore, controlling the input text is still the most effective way to maximize the results of the systems that we have today. The drawback is that these authoring systems are often perceived as straitjackets by writers. Also, experience shows that corporate style guides are generally not adhered to. On pre-existing documents, you could optimize MT results by running a preprocessor that checks the writing style for translatability before the document is actually translated.
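As a rough illustration of such a preprocessor, the sketch below flags sentences that a rule-based system is likely to find hard. It is not the Logos translatability index, whose workings are not described here: the two checks (sentence length and parenthetical inserts), the 25-word limit and all function names are assumptions chosen only to show the shape of the idea.

```python
import re

MAX_WORDS = 25  # assumed limit; a real style guide would set its own


def check_sentences(text):
    """Split text into sentences and list the style issues found in each."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    report = []
    for sentence in sentences:
        issues = []
        if len(sentence.split()) > MAX_WORDS:
            issues.append("long sentence")
        if "(" in sentence or ")" in sentence:
            issues.append("parenthetical insert")
        report.append((sentence, issues))
    return report


def clean_share(text):
    """Share of sentences with no flagged issue (1.0 for empty input)."""
    report = check_sentences(text)
    if not report:
        return 1.0
    clean = sum(1 for _, issues in report if not issues)
    return clean / len(report)
```

A writer or editor could run `check_sentences` on a draft and rework the flagged sentences before the document goes to translation; `clean_share` gives a single number for comparing documents.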
What role do standards play here?
In the context of MT, standards really only play a role when it comes to terminology. For instance, users of MT may want to exchange dictionaries between dissimilar MT engines. This is why Logos Corporation conforms to OLIF, which was developed as part of the European Otelo project.
Given an integrated system that also incorporates TM, there is a need to exchange the contents of sentence memories between systems. In this context, LISA’s TMX is the definitive standard.
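To make the exchange concrete, the sketch below writes sentence pairs to a small TMX document and reads them back, using only Python's standard `xml.etree.ElementTree`. The element names (`tmx`, `header`, `body`, `tu`, `tuv`, `seg`) and header attributes follow the TMX format; the version number, the tool name and the function names are assumptions made for the example, and a production exporter would need to handle inline markup and metadata that are ignored here.

```python
import xml.etree.ElementTree as ET

# ElementTree serializes this namespaced key as the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_tmx(pairs, src_lang="en", tgt_lang="de"):
    """Serialize (source, target) sentence pairs as a TMX string."""
    tmx = ET.Element("tmx", version="1.4")  # version chosen for illustration
    ET.SubElement(tmx, "header", {
        "creationtool": "example-script",  # hypothetical tool name
        "creationtoolversion": "0.1",
        "segtype": "sentence",
        "o-tmf": "none",
        "adminlang": "en",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for source, target in pairs:
        tu = ET.SubElement(body, "tu")  # one translation unit per pair
        for lang, text in ((src_lang, source), (tgt_lang, target)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")


def read_tmx(xml_text):
    """Return one {language: segment} dict per translation unit."""
    root = ET.fromstring(xml_text)
    return [
        {tuv.get(XML_LANG): tuv.findtext("seg") for tuv in tu.iter("tuv")}
        for tu in root.iter("tu")
    ]
```

Because both sides agree on the same element structure, a memory exported by one system with `build_tmx` can be loaded by another with `read_tmx` without either knowing the other's internal storage format.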
Will the need for integrated solutions drive a wave of mergers and acquisitions?
Yes, I believe so. Individual players are doing a good job in their respective domains, but the customer needs a fully integrated solution. This requires cooperation on an ongoing basis and vast vendor resources. However, everybody is already spending a lot of money just optimizing their own core technology. Also, if you look at today’s companies, they all started from the ground up, developing technical solutions that rarely contemplate the full picture. This is a big problem when what the customer wants is an integrated solution.
As we all know, Lernout & Hauspie have tried to solve this integration problem by acquiring a variety of technologies and companies, which has given them early access to the capital market. The problem that they are facing now is how to integrate the different technologies and entities - this could prove to be very, very difficult.
What buttons will companies have to press to raise the necessary finance?
This is not easy. Historically, we were caught in a Catch-22 situation because people hesitated to invest in something as difficult as MT. Venture capitalists, for instance, want simple deals and fast-track results. After you have talked to them for half an hour about MT they say: “Oh, this is too tough.” Not only are you facing an extremely complex technical and technological problem, you are also faced with a very difficult market picture. Today, however, the dynamics of the Internet and the emergence of e-commerce have created a new language awareness conducive to the adoption of automated translation solutions. Investment patterns are changing in response to this - as is evidenced by the funding of numerous companies in the domain of language automation.
A number of companies in the language business have recently raised capital from venture capital funds or IPOs. What did they do right?
Several start-ups have gained the attention of the marketplace because they focus on solutions that venture capitalists can easily understand and relate to. Money flows to those situations that give the appearance of being marketing-driven, or of addressing language problems in a tangible way. To the extent that venture capitalists begin to understand more fully the pivotal role that MT plays in Web-based, B2B translation solutions, increased investment will result. Where the funding of MT companies has been problematic is in standalone applications targeting broad consumer markets.
Surely that’s a legitimate use for MT?
It is okay, but it hurts the perception of MT technology. This is because, by and large, MT can’t deliver on consumer quality expectations. As mentioned before, with MT you should ideally have control over the document you’re translating and this is seldom the case in an Internet environment. This means that you could obtain totally erratic results in translation. MT is inherently an imperfect technology and will always be so. Does that mean it isn’t useful, or that you cannot get superior economics? Absolutely not! Our customers have proven it. With proper process you can get 95% perfect translations and 60-70% cost savings.
What is the role of LISA with respect to MT?
Spread the word! MT as a technology is becoming increasingly robust and integrated into many of LISA’s member companies and their products. MT has clearly become an integral part of comprehensive translation solutions.
LISA’s role in providing a forum for the exchange of information between tools suppliers of all kinds and their users will help drive further the successful integration of translation technologies. Ultimately, the really robust and complete language solutions will emerge from this collaboration.
Thank you very much.
Jens Thomas Lueck is former President & CEO of the Logos Corporation.
Reprinted by permission from the Globalization Insider,
1 July 2003, Volume XII, Issue 3.1.
Copyright the Localization Industry Standards Association
(Globalization Insider: www.localization.org, LISA: www.lisa.org)
and S.M.P. Marketing Sarl (SMP) 2004