Pondering and Wondering
What better time than the end of the year to sit back and ponder the things that have happened in the past year and wonder what the next year will bring? (I won't even try to talk about the last and next decade!) There is a lot to wonder and ponder: What has been particularly notable in the past year in our industry? What were the things that riled us up, enraged us, excited us, or disgusted us?
Another of last year's notable topics was clearly machine translation. If you have not been involved in several discussions about machine translation with colleagues or other peers this past year, it's time for you to go out and get a social life! Companies like Google, Microsoft, Asia Online, and others have been pushing us to reconsider the applicability of machine translation on the basis of usability. The very concept of quality, with which we have long had a love-hate relationship, has seriously come under fire. The argument goes like this: Since translation quality is very abstract and arguable (yes, we would all agree here), the only relevant measure for translation is usefulness. For some kinds of texts, high stylistic standards are very important (think: literature, marketing); for others it's accuracy (think: legal, medical); and for still others the only thing that counts is the transfer of information (think: social networks, some technical documentation, customer support data). You may disagree with those classifications, but these are the lines that many very large corporations are drawing when deciding what to give to translators and what to have machine translation do.
And then there is another topic that has come charging to the forefront in just the last few months: the availability of large amounts of bilingual data that can be used in translation memories. Here is just a sampling:
And then there are translation environment tools (TEnTs) like Lingotek's suite of tools, Google Translator Toolkit, and Wordfast's VLTM, which are built around the concept of anonymous data sharing through translation memories, and alignment tools like AlignFactory and NoBabel's AutoAligner, which have finally made the alignment of large amounts of web-based content feasible.
So what are we going to do with all of this? Is this sudden flood of data going to be helpful or harmful to our productivity via translation memory technology? The short answer is: I don't know. But I do have an inkling.
When I first started to use TEnTs, I was very eager to build up my own data so that I could benefit from my past labor. My "Big Mama TM" grew and grew, and I was always excited to find matches from (almost) forgotten previous projects. As the years passed, I continued to use and feed my by now obscenely large Big Mama TM, but her usefulness seemed to decline rather than improve. Too much time had passed between the earlier projects and the current ones for me to really place the matches that were displayed (despite every translation unit being tagged with subject and client information). In addition, language had changed and so had my skill level, so I spent a lot of time deleting or wading through useless suggestions from the TM. The fact that many of the newer TEnTs now also offered subsegment matching, which allowed them to dig even deeper into the language materials, did not help either.
I have increasingly come to realize that while large amounts of data are very powerful, they can also be very distracting if they a) originate from a subject matter or client that uses a different terminology or style; b) come from dated or obsolete sources; or c) come from sources with a different quality level. So what does it mean to have all these gigantic data vaults at our disposal if my conclusions are true?
I think that many of them are fantastic as reference materials, but I am just not sure about their value as translation memory data in the classic sense. And it's important to keep in mind that many of these resources were not produced for translation memory purposes (even though that may be their origin), but to feed the ever-hungry statistically based machine translation engines with their favorite food: bilingual data.
Am I suddenly advocating the dismissal of translation memory technology? Not on your life! I still think that TM technology in concert with terminology resources should form the foundation of the tool kit of every translator who works on functional texts. But I have also come to the realization that raw data, including translation memory data, has no value per se. The value of data for the human translator is in its quality and appropriateness.
Here's to a good and successful 2010 and 20-teens!
Published - April 2010