Where Do Translators Fit into Machine Translation?
By
Alex Gross
http://language.home.sprynet.com
alexilen@sprynet.com
Original and Supplementary Questions
Submitted to the MT Summit III Conference,
Washington, 1991
Here are the original questions for this panel as submitted
to the speakers:
1. At the last MT Summit, Martin Kay stated that there
should be "greater attention to empirical studies
of translation so that computational linguists will
have a better idea of what really goes on in translation
and develop tools that will be more useful for the end
user." Does this mean that there has been insufficient
input into MT processes by translators interested in
MT? Does it mean that MT developers have failed to study
what translating actually entails and how translators
go about their task? If either of these is true, then
to what extent and why? New answers and insights for
the MT profession could arise from hearing what human
translators with an interest in the development of MT
have to say about these matters. It may well turn out
that translators are the very people best qualified
to determine what form their tools should take, since
they are the end users.
2. Is there a specifically "human" component
in the translation process which MT experts have overlooked?
Is it reasonable for theoreticians to envision setting
up predictable and generic vocabularies of clearly defined
terms, or could they be overlooking a deep-seated human
tendency towards some degree of ambiguity, indeed,
in those many cases where not all the facts are known,
an inescapably human reliance on it? Are there any viable
MT approaches to duplicate what human translators can
provide in such cases, namely the ability to bridge
this ambiguity gap and improvise personalized, customized
case-specific subtleties of vocabulary, depending on
client or purpose? Could this in fact be a major element
of the entire translation process? Alternatively, are
there some more boring "machine-like" aspects
of translation where the computer can help the translator,
such as style and consistency checking?
3. How can the knowledge of practicing translators best
be integrated into current MT research and working systems?
Is it to be assumed that they are best employed as prospective
end-users working out the bugs in the system, or is
there also a place for them during the initial planning
phases of such systems? Can they perhaps as users be
the primary developers of the system?
4. Many human translators, when told of the quest to
have machines take over all aspects of translation,
immediately reply that this is impossible and start
providing specific instances which they claim a machine
system could never handle. Are such reactions merely
the final nerve spasms of a doomed class of technicians
awaiting superannuation, or are these translators in
fact enunciating specific instances of a general law
as yet not fully articulated?
Since we now hear claims suggesting that FAHQT is creeping
in again through the back door, it seems important to
ask whether there has in fact ever been sufficient basic
mathematical research, much less algorithmic underpinnings,
by the MT Community to determine whether FAHQT, or anything
close to it, can be achieved by any combination of electronic
stratagems (transfer, AI, neural nets, Markov models,
etc.).
Must translators forever stand exposed on the firing
line and present their minds and bodies to a broadside
of claims that the next round of computer advances will
annihilate them as a profession? Is this problem truly
solvable in logical terms, or is it in fact an intractable,
undecidable, or provably unsolvable question in terms
of "Computable Numbers" as set out by Turing,
based on the work of Hilbert and Goedel? A reasonable
answer to this question could save boards of directors
and/or government agencies a great deal of time and
money.
SUPPLEMENTAL QUESTIONS:
It was also envisioned that a list of Supplemental Questions
would be prepared and distributed not only to the speakers
but everyone attending our panel, even though not all
of these questions could be raised during the session,
so as to deepen our discussion and provide a lasting
record of these issues.
FAHQT: Pro and Con
Consider the following observation on FAHQT: "The
ideal notion of fully automatic high quality translation
(FAHQT) is still lurking behind the machine translation
paradigm: it is something that MT projects want to reach."
(1) Is this a true or a false observation?
Is FAHQT merely a matter of time and continued research,
a direct and inevitable result of a perfectly asymptotic
process?
Will FAHQT ever be available on a hand-held calculator-sized
computer? If not, then why not?
To what extent is the belief in the feasibility of FAHQT
a form of religion or perhaps akin to a belief that
a perpetual motion device can be invented?
Technical Linguistic Questions
Let us suppose a writer has chosen to use Word C in
a source text because s/he did not wish to use Word
A or Word B, even though all three are shown as "synonyms."
It turns out that all three of these words overlap and
semantically interrelate quite differently in the target
language. How can MT handle such an instance, fairly
frequently found in legal and diplomatic usage?
Virtually all research in both conventional and computational
linguistics has proceeded from the premise that language
can be represented and mapped as a linear entity and
is therefore eminently computable. What if it turns
out that language in fact occupies a virtual space as
a multi-dimensional construct, including several fractal
dimensions, involving all manner of non-linear turbulence,
chaos, and Butterfly Effects?
Post-Editors and Puppeteers
Let's assume you saw an ad for an Automatic Electronic
Puppeteer that guaranteed to create and produce endless
puppet plays in your own living room. There would be
no need for a puppeteer to run the puppets and no need
for you even to script the plays, though you would have
the freedom to intervene in the action and change the
plot as you wished. Since the price was acceptable,
you ordered this system, but when it arrived, you found
that it required endless installation work and calls
to the manufacturers to get it working. But even then,
you discovered that the number of plays provided was
in fact quite limited, your plot change options even
more so, and that the movements of the puppets were
jerky and unnatural. When you complained, you were referred
to fine print in the docs telling you that to make the
program work better, you would have to do one of two
things: 1) master an extremely complex programming language
or 2) hire a specially trained puppeteer to help you
out with your special needs and to be on hand during
your productions to make the puppets move more naturally.
Does this description bear any resemblance to the way
MT has functioned and been promoted in recent years?
A Practical Example
Despite many presentations on linguistic, electronic
and philosophical aspects of MT at this conference,
one side of translation has nonetheless gone unexplored.
It has to do with how larger translation projects actually
arise and are handled by the profession. The following
story shows the world of human translation at close
to its worst, and it might be imagined at first glance
that MT could easily do a much better job and simply
take over in such situations, which are far from atypical
in the world of translation. But, as we shall see, such
appearances may be deceptive. To our story:
A French electrical firm was recently involved in a
hostile take-over bid and lawsuit with its American
counterpart. Large numbers of boxes and drawers full
of documents all had to be translated into English by
an almost impossible deadline. Supervision of this work
was entrusted to a paralegal assistant in the French
company's New York law firm. This person had no previous
knowledge of translation. The documents ran the gamut
from highly technical electrical texts and patents,
records of previous lawsuits, company correspondence,
advertisements, product documentation, speeches by the
Company's directors, etc.
Almost every French-to-English translator in the NYC
area was asked to take part. All translators were required
to work at the law firm's offices so as to preserve
confidentiality. Mere translation students worked side
by side with newly accredited professionals and journeymen
with long years of experience. The more able quickly
became aware that much of the material was far too difficult
for their less experienced colleagues. No consistent
attempt was made to create or distribute glossaries.
Wildly differing wages were paid to translators, with
little connection to their ability. Several translation
agencies were caught up in a feverish battle to handle
most of the work and desperately competed to find translators.
No one knows the quality of the final product, but it
cannot have been routinely high. Some translators and
agencies have still not been fully paid. As the deadline
drew closer, more and more boxes of documents appeared.
And as the final blow, the opposing company's law firm
also came onto the scene with boxes of its own documents
that needed translation. But these newcomers imposed
one nearly impossible condition, also for reasons of
confidentiality: no one who had translated for the first
law firm would be permitted to translate for them.
Now let us consider this true-life tale, which occurred
just three months ago, and see how (or whether) MT
could have handled things better, as is sometimes claimed.
Let's be generous and remove one enormous obstacle at
the start by assuming that all these cases of documents
were in fact in machine-readable form (which, of course,
they weren't). Even if we accord MT this ample handicap,
there are still a number of problems it would have had
trouble coping with:
1. How could a sufficient number of competent post-editors
be found or trained before the deadline?
2. How could a sufficiently large and accurate MT dictionary
be compiled before the deadline? Doesn't creating such
a dictionary require finishing the job first and then
saving it for the next job, in the hope that it will
be similar?
3. The simpler Mom & Pop store & smaller agency
structure of the human translation world was nonetheless
able to field at least some response to this challenge
because of its large slack capacity. Would an enormously
powerful and expensive mainframe computer have the same
slack capacity, i.e., could it be kept inactive for
long periods of time until such emergencies occurred?
If so, how would this be reflected in the prices charged
for its services?
4. How would MT companies have dealt with the secrecy
requirement, that translation must be done in the law
firm's office?
5. How would an MT Company comply with the demand of
the second law firm, that the same post-editors not
be used, and still land the job?
6. Supposing the job proved so enormous that two MT
firms had to be hired, assuming they used different
systems, different glossaries, different post-editors,
how could they have collaborated without creating even
more work and confusion?
Larger Philosophical Questions
Is it in any final sense a reasonable assumption, as
many believe, that progress in MT can be gradual and
cumulative in scope until it finally comes to a complete
mastery of the problem? In other words, is there a numerical
process by which one first masters 3% of all knowledge
and vocabulary building processes with 85% accuracy,
then 5% with 90% accuracy, and so on until one reaches
99% with 99% accuracy? Is this the whole story of the
relationship between knowledge and language, or are
there possibly other factors involved, making it possible
for reality to manifest itself from several unexpected
angles at once? In other words, are we dealing with
language as a linear entity when it is in fact a multi-dimensional
one?
Einstein maintained that he didn't believe God was playing
dice with the universe. Is it possible that by using
AI rule-firing techniques with their built-in certainty
and confidence values, computational linguists are playing
dice with the meaning of that universe?
It would be possible to design a set of "Turing
Tests" to gauge the performance of various MT systems
as compared with human translation skills. The point
of such a process, as with all Turing Tests, would be
to determine if human referees could tell the difference
between human and machine output. All necessary safeguards,
handicaps, alternate referees, and double blind procedures
could be devised, provided the will to take part in
such tests actually existed. True definitions for cost,
speed, accuracy, and post-editing needs might all have
at least a chance of being estimated as a result of
such tests. What are the chances of their taking place
some time in the near future?
"Computerization is the first stage of the industrial
revolution that hasn't made work simpler." Does
this statement, paraphrased from a book by a Harvard
Business School professor, (2) have any relevance for
MT? Is it correct to state that several current MT systems
actually add one or more levels of difficulty to the
translation process before making it any easier?
While translators may not be able to articulate precisely
what kind of interface for translation they most desire,
they can certainly state with great certainty what they
do NOT want. What they do not want is an interface that
is any of the following:
harder to learn and use than conventional translation;
more likely to make mistakes than the above;
lending less prestige than the above;
less well paid than the above.
Are these also concerns for MT developers?
What real work has been done in the AI field in terms
of treating translation as a Knowledge Domain and translators
as Domain Experts and pairing them off with Knowledge
Engineers? What qualifications were sought in either
the DE's or the KE's?
Are MT developers using the words "asymptote"
and "asymptotic" in their correct mathematical
sense, or are they rather using them as buzzwords to
impart a false air of mathematical precision to their
work? Is the curve their would-be asymptote steadily
approaching a representation of FAHQT or something reasonably
similar, or could it just turn out to be the edge of
a semanto-linguistic Butterfly Effect drawing them inexorably
into what Shannon and Weaver recognized as entropy,
perhaps even into true Chaos?
Must not all translation, including MT, be recognized
as a subset of two far larger sets, namely writing and
human mediation? In the first case, does it not therefore
become pointless to maintain that there are no accepted
standards for what constitutes a "good translation,"
when of course there are also no accepted standards
for what constitutes "good writing?" Or for
that matter, no accepted standards for what constitutes
"correct writing practices," since all major
publications and publishing houses have their own in-house
style manuals, with no two in total agreement, either
here or in England. And is not translation also a specialized
subset of a more generalized form of "mediation,"
merely employing two natural languages instead of one?
In which case, may it belong to the same superset which
includes "explaining company rules to new employees,"
public relations and advertising, or choosing exactly
the right time to tell Uncle Louis you're marrying someone
he disapproves of?
Are not the only real differences between foreign language
translation and such upscale mediation that two languages
are involved and the context is usually more limited?
In either case (or in both together), what happens if
all the complexities that can arise from superset activities
descend into the subset and also become "translation
problems" at any time? How does MT deal with either
of these cases?
Does the following reflection by Wittgenstein apply
to MT: "A sentence is given me in code together
with the key. Then of course in one way everything required
for understanding the sentence has been given me. And
yet I should answer the question `Do you understand
this sentence?': No, not yet; I must first decode it.
And only when e.g. I had translated it into English
would I say `Now I understand it.'
"If now we raise the question `At what moment of
translating do I understand the sentence?' we shall get
a glimpse into the nature of what is called `understanding.'"
To take Wittgenstein's example one step further, if
MT is used, at what moment of translation does what
person or entity understand the sentence? When does
the system understand it? How about the hasty post-editor?
And what about the translation's target audience, the
client? Can we be sure that understanding has taken
place at any of these moments? And if understanding
has not taken place, has translation?
Practical Suggestions for the Future
1. The process of consultation and cooperation between
working translators and MT specialists which has begun
here today should be extended into the future through
the appointment of Translators in Residence in university
and corporate settings, continued lectures and workshops
dealing with these themes on a national and international
basis, and greater consultation between them in all
matters of mutual concern.
2. In the past, many legislative titles for training
and coordinating workers have gone unused during each
Congressional session in the Department of Labor, HEW,
and Commerce. If there truly is a need for retraining
translators to use MT and CAT products, it behooves
system developers (and might even benefit them financially) to
find out if such funding titles can be used to help
train translators in the use of truly viable MT systems.
3. It should be the role of an organization such as
MT Summit III to launch a campaign aimed at helping
people everywhere to understand what human translation
and machine translation can and cannot do so as to counter
a growing trend towards fast-word language consumption
and use.
4. Concomitantly, those present at this Conference should
make their will known on an international scale that
there is no place in the MT Community for those who
falsify the facts about the capabilities of either MT
or human translators. The fact that foreign language
courses, both live and recorded, have been deceitfully
marketed for decades should not be used as an excuse
to do the same with MT. I have appended a brief Code
of Ethics document for discussion of this matter.
5. Since AI and expert systems are on the lips of many
as the next direction for MT, a useful first step in
this direction might be the creation of a simple expert
system which prospective clients might use to determine
if their translation needs are best met by MT, human
translation, or some combination of both. I would be
pleased to take part in the design of such a program.
DRAFT CODE OF ETHICS:
1. No claims about existing or pending MT products should
be made which indicate that MT can reduce the number
of human translators or the total cost of translation
work unless all costs for the MT project have been scrupulously
revealed, including the total price for the system,
fees or salaries for those running it, training costs
for such workers, training costs for additional pre-editors
or post-editors including those who fail at this task,
and total costs of amortization over the full period
of introducing such a system.
2. No claims should be made for any MT system in terms
of "percentage of accuracy," unless this figure
is also spelled out in terms of number of errors per
page. Any unwillingness to recognize errors as errors
shall be considered a violation of this condition, except
in those cases where totally error-free work is not
required or requested.
3. No claim should be made that any MT system produces
"better-quality output" than human translators
unless such a claim has been thoroughly quantified to
the satisfaction of all parties. Any such claim should
be regarded as merely anecdotal until proved otherwise.
4. Researchers and developers should devote serious
study to the issue of whether their products might generate
less sales resistance, public confusion, and resentment
from translators if the name of the entire field were
to be changed from "machine translation" or
"computer translation" to "computer assisted
language conversion."
5. The computer translation industry should bear the
cost of setting up an equitably balanced committee of
MT workers and translators to oversee the functioning
of this Code of Ethics.
6. Since translation is an intrinsically international
industry, this Code of Ethics must also be international
in its scope, and any company violating its tenets on
the premise that they are not valid in its country shall
be considered in violation of this Code. Measures shall
be taken to expose and punish habitual offenders.
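As an illustration of point 2 above, the following is a minimal sketch of how a claimed "percentage of accuracy" might be restated as errors per page. It rests on two assumptions that are not part of this Code: that accuracy is measured word by word, and that a page holds 250 words. The function name errors_per_page is likewise hypothetical.

```python
def errors_per_page(accuracy_pct: float, words_per_page: int = 250) -> float:
    """Restate a word-level accuracy percentage as expected errors per page.

    Assumptions (illustrative only): accuracy is counted per word, and a
    page holds `words_per_page` words.
    """
    if not 0.0 <= accuracy_pct <= 100.0:
        raise ValueError("accuracy_pct must be between 0 and 100")
    if words_per_page <= 0:
        raise ValueError("words_per_page must be positive")
    # The share of words that are wrong, multiplied by the words on a page.
    return (100.0 - accuracy_pct) / 100.0 * words_per_page


if __name__ == "__main__":
    # Print the errors-per-page figure for a few claimed accuracy levels.
    for pct in (85, 90, 95, 99):
        print(f"{pct}% accuracy: about {errors_per_page(pct):.1f} errors per page")
```

Under these assumptions, a system advertised as 95% accurate would still leave about 12.5 errors on every page, which is the kind of figure a prospective client can actually weigh.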
Respectfully Submitted by
Alex Gross, Co-Director
Cross-Cultural Research Projects
alexilen@sprynet.com
NOTES:
(1) Kimmo Kettunen, in a letter to Computational Linguistics,
vol. 12, No. 1, January-March, 1986
(2) Shoshana Zuboff: In the Age of the Smart Machine:
The Future of Work and Power, Basic Books, 1991.