Where Do Translators Fit into Machine Translation?
By Alex Gross
http://language.home.sprynet.com
alexilen@sprynet.com
Original and Supplementary Questions
Submitted to the MT Summit III Conference,
Washington, 1991
Here are the original questions for
this panel as submitted to the speakers:
1. At the last MT Summit, Martin Kay
stated that there should be "greater attention
to empirical studies of translation so that computational
linguists will have a better idea of what really
goes on in translation and develop tools that will
be more useful for the end user." Does this
mean that there has been insufficient input into
MT processes by translators interested in MT? Does
it mean that MT developers have failed to study
what translating actually entails and how translators
go about their task? If either of these is true,
then to what extent and why? New answers and insights
for the MT profession could arise from hearing what
human translators with an interest in the development
of MT have to say about these matters. It may well
turn out that translators are the very people best
qualified to determine what form their tools should
take, since they are the end users.
2. Is there a specifically "human" component
in the translation process which MT experts have
overlooked? Is it reasonable for theoreticians to
envision setting up predictable and generic vocabularies
of clearly defined terms, or could they be overlooking
a deep-seated human tendency towards some degree
of ambiguity (indeed, in those many cases where
not all the facts are known, an inescapably human
reliance on it)? Are there any viable MT approaches
to duplicate what human translators can provide
in such cases, namely the ability to bridge this
ambiguity gap and improvise personalized, customized
case-specific subtleties of vocabulary, depending
on client or purpose? Could this in fact be a major
element of the entire translation process? Alternatively,
are there some more boring "machine-like"
aspects of translation where the computer can help
the translator, such as style and consistency checking?
3. How can the knowledge of practicing translators
best be integrated into current MT research and
working systems? Is it to be assumed that they are
best employed as prospective end-users working out
the bugs in the system, or is there also a place
for them during the initial planning phases of such
systems? Can they perhaps as users be the primary
developers of the system?
4. Many human translators, when told of the quest
to have machines take over all aspects of translation,
immediately reply that this is impossible and start
providing specific instances which they claim a
machine system could never handle. Are such reactions
merely the final nerve spasms of a doomed class
of technicians awaiting superannuation, or are these
translators in fact enunciating specific instances
of a general law as yet not fully articulated?
Since we now hear claims suggesting that FAHQT (fully
automatic high quality translation) is
creeping in again through the back door, it seems
important to ask whether there has in fact ever
been sufficient basic mathematical research, much
less algorithmic underpinnings, by the MT Community
to determine whether FAHQT, or anything close to
it, can be achieved by any combination of electronic
stratagems (transfer, AI, neural nets, Markov models,
etc.).
Must translators forever stand exposed on the firing
line and present their minds and bodies to a broadside
of claims that the next round of computer advances
will annihilate them as a profession? Is this problem
truly solvable in logical terms, or is it in fact
an intractable, undecidable, or provably unsolvable
question in terms of "Computable Numbers"
as set out by Turing, based on the work of Hilbert
and Goedel? A reasonable answer to this question
could save boards of directors and/or government
agencies a great deal of time and money.
SUPPLEMENTAL QUESTIONS:
It was also envisioned that a list
of Supplemental Questions would be prepared and
distributed not only to the speakers but also to everyone
attending our panel, even though not all of these
questions could be raised during the session, so
as to deepen our discussion and provide a lasting
record of these issues.
FAHQT: Pro and Con
Consider the following observation on FAHQT: "The
ideal notion of fully automatic high quality translation
(FAHQT) is still lurking behind the machine translation
paradigm: it is something that MT projects want
to reach." (1) Is this a true or a false observation?
Is FAHQT merely a matter of time and continued research,
a direct and inevitable result of a perfectly asymptotic
process?
Will FAHQT ever be available on a hand-held calculator-sized
computer? If not, then why not?
To what extent is the belief in the feasibility
of FAHQT a form of religion or perhaps akin to a
belief that a perpetual motion device can be invented?
Technical Linguistic Questions
Let us suppose a writer has chosen to use Word C
in a source text because s/he did not wish to use
Word A or Word B, even though all three are shown
as "synonyms." It turns out that all three
of these words overlap and semantically interrelate
quite differently in the target language. How can
MT handle such an instance, fairly frequently found
in legal and diplomatic usage?
Virtually all research in both conventional and
computational linguistics has proceeded from the
premise that language can be represented and mapped
as a linear entity and is therefore eminently computable.
What if it turns out that language in fact occupies
a virtual space as a multi-dimensional construct,
including several fractal dimensions, involving
all manner of non-linear turbulence, chaos, and
Butterfly Effects?
Post-Editors and Puppeteers
Let's assume you saw an ad for an Automatic Electronic
Puppeteer that guaranteed to create and produce
endless puppet plays in your own living room. There
would be no need for a puppeteer to run the puppets
and no need for you even to script the plays, though
you would have the freedom to intervene in the action
and change the plot as you wished. Since the price
was acceptable, you ordered this system, but when
it arrived, you found that it required endless installation
work and calls to the manufacturers to get it working.
But even then, you discovered that the number of
plays provided was in fact quite limited, your plot
change options even more so, and that the movements
of the puppets were jerky and unnatural. When you
complained, you were referred to fine print in the
docs telling you that to make the program work better,
you would have to do one of two things: 1) master
an extremely complex programming language or 2)
hire a specially trained puppeteer to help you out
with your special needs and to be on hand during
your productions to make the puppets move more naturally.
Does this description bear any resemblance to the
way MT has functioned and been promoted in recent
years?
A Practical Example
Despite many presentations on linguistic, electronic
and philosophical aspects of MT at this conference,
one side of translation has nonetheless gone unexplored.
It has to do with how larger translation projects
actually arise and are handled by the profession.
The following story shows the world of human translation
at close to its worst, and it might be imagined
at first glance that MT could easily do a much better
job and simply take over in such situations, which
are far from atypical in the world of translation.
But, as we shall see, such appearances may be deceptive.
To our story:
A French electrical firm was recently involved in
a hostile takeover bid and lawsuit with its American
counterpart. Large numbers of boxes and drawers
full of documents all had to be translated into
English by an almost impossible deadline. Supervision
of this work was entrusted to a paralegal assistant
in the French company's New York law firm. This
person had no previous knowledge of translation.
The documents ran the gamut: highly technical
electrical texts and patents, records of previous
lawsuits, company correspondence, advertisements,
product documentation, speeches by the company's
directors, etc.
Almost every French-to-English translator in the
NYC area was asked to take part. All translators
were required to work at the law firm's offices
so as to preserve confidentiality. Mere translation
students worked side by side with newly accredited
professionals and journeymen with long years of
experience. The more able quickly became aware that
much of the material was far too difficult for their
less experienced colleagues. No consistent attempt
was made to create or distribute glossaries. Wildly
differing wages were paid to translators, with little
connection to their ability. Several translation
agencies were caught up in a feverish battle to
handle most of the work and desperately competed
to find translators.
No one knows the quality of the final product, but
it cannot have been routinely high. Some translators
and agencies have still not been fully paid. As
the deadline drew closer, more and more boxes of
documents appeared. And as the final blow, the opposing
company's law firm also came onto the scene with
boxes of its own documents that needed translation.
But these newcomers imposed one nearly impossible
condition, also for reasons of confidentiality:
no one who had translated for the first law firm
would be permitted to translate for them.
Now let us consider this true-life tale, which occurred
just three months ago, and see how, or whether, MT
could have handled things better, as is sometimes
claimed. Let's be generous and remove one enormous
obstacle at the start by assuming that all these
cases of documents were in fact in machine-readable
form (which, of course, they weren't). Even if we
accord MT this ample handicap, there are still a
number of problems it would have had trouble coping
with:
1. How could a sufficient number of competent post-editors
be found or trained before the deadline?
2. How could a sufficiently large and accurate MT
dictionary be compiled before the deadline? Doesn't
creating such a dictionary require finishing the
job first and then saving it for the next job, in
the hope that it will be similar?
3. The simpler Mom & Pop store and smaller
agency structure of the human translation world
was nonetheless able to field at least some response
to this challenge because of its large slack capacity.
Would an enormously powerful and expensive mainframe
computer have the same slack capacity, i.e., could
it be kept inactive for long periods of time until
such emergencies occurred? If so, how would this
be reflected in the prices charged for its services?
4. How would MT companies have dealt with the secrecy
requirement, that translation must be done in the
law firm's office?
5. How would an MT Company comply with the demand
of the second law firm, that the same post-editors
not be used, and still land the job?
6. Supposing the job proved so enormous that two
MT firms had to be hired, and assuming they used
different systems, different glossaries, and different
post-editors, how could they have collaborated without
creating even more work and confusion?
Larger Philosophical Questions
Is it in any final sense a reasonable assumption,
as many believe, that progress in MT can be gradual
and cumulative in scope until it finally comes to
a complete mastery of the problem? In other words,
is there a numerical process by which one first
masters 3% of all knowledge and vocabulary building
processes with 85% accuracy, then 5% with 90% accuracy,
and so on until one reaches 99% with 99% accuracy?
Is this the whole story of the relationship between
knowledge and language, or are there possibly other
factors involved, making it possible for reality
to manifest itself from several unexpected angles
at once? In other words, are we dealing with language
as a linear entity when it is in fact a multi-dimensional
one?
Einstein maintained that he didn't believe God was
playing dice with the universe. Is it possible that
by using AI rule-firing techniques with their built-in
certainty and confidence values, computational linguists
are playing dice with the meaning of that universe?
It would be possible to design a set of "Turing
Tests" to gauge the performance of various
MT systems as compared with human translation skills.
The point of such a process, as with all Turing
Tests, would be to determine if human referees could
tell the difference between human and machine output.
All necessary safeguards, handicaps, alternate referees,
and double blind procedures could be devised, provided
the will to take part in such tests actually existed.
True definitions for cost, speed, accuracy, and
post-editing needs might all have at least a chance
of being estimated as a result of such tests. What
are the chances of their taking place some time
in the near future?
"Computerization is the first stage of the
industrial revolution that hasn't made work simpler."
Does this statement, paraphrased from a book by
a Harvard Business School professor, (2) have any
relevance for MT? Is it correct to state that several
current MT systems actually add one or more levels
of difficulty to the translation process before
making it any easier?
While translators may not be able to articulate
precisely what kind of interface for translation
they most desire, they can certainly state with
great certainty what they do NOT want. What they
do not want is an interface that is any of the following:
harder to learn and use than conventional translation;
more likely to make mistakes than the above;
lending less prestige than the above;
less well paid than the above.
Are these also concerns for MT developers?
What real work has been done in the AI field in
terms of treating translation as a Knowledge Domain
and translators as Domain Experts and pairing them
off with Knowledge Engineers? What qualifications
were sought in either the DEs or the KEs?
Are MT developers using the words "asymptote"
and "asymptotic" in their correct mathematical
sense, or are they rather using them as buzzwords
to impart a false air of mathematical precision
to their work? Is the curve their would-be asymptote
steadily approaching a representation of FAHQT or
something reasonably similar, or could it just turn
out to be the edge of a semanto-linguistic Butterfly
Effect drawing them inexorably into what Shannon
and Weaver recognized as entropy, perhaps even into
true Chaos?
Must not all translation, including MT, be recognized
as a subset of two far larger sets, namely writing
and human mediation? In the first case, does it
not therefore become pointless to maintain that
there are no accepted standards for what constitutes
a "good translation," when of course there
are also no accepted standards for what constitutes
"good writing?" Or for that matter, no
accepted standards for what constitutes "correct
writing practices," since all major publications
and publishing houses have their own in-house style
manuals, with no two in total agreement, either
here or in England. And is not translation also
a specialized subset of a more generalized form
of "mediation," merely employing two natural
languages instead of one? In which case, may it
belong to the same superset which includes "explaining
company rules to new employees," public relations
and advertising, or choosing exactly the right time
to tell Uncle Louis you're marrying someone he disapproves
of?
Are not the only real differences between foreign
language translation and such upscale mediation
that two languages are involved and the context
is usually more limited? In either case (or in both
together), what happens if all the complexities
that can arise from superset activities descend
into the subset and also become "translation
problems" at any time? How does MT deal with
either of these cases?
Does the following reflection by Wittgenstein apply
to MT: "A sentence is given me in code together
with the key. Then of course in one way everything
required for understanding the sentence has been
given me. And yet I should answer the question 'Do
you understand this sentence?': No, not yet; I must
first decode it. And only when e.g. I had translated
it into English would I say 'Now I understand it.'
"If now we raise the question 'At what moment
of translating do I understand the sentence?' we
shall get a glimpse into the nature of what is called
'understanding.'" To take Wittgenstein's example
one step further, if MT is used, at what moment
of translation does what person or entity understand
the sentence? When does the system understand it?
How about the hasty post-editor? And what about
the translation's target audience, the client? Can
we be sure that understanding has taken place at
any of these moments? And if understanding has not
taken place, has translation?
Practical Suggestions for the Future
1. The process of consultation and cooperation between
working translators and MT specialists which has
begun here today should be extended into the future
through the appointment of Translators in Residence
in university and corporate settings, continued
lectures and workshops dealing with these themes
on a national and international basis, and greater
consultation between them in all matters of mutual
concern.
2. In the past, many legislative titles for training
and coordinating workers have gone unused during
each Congressional session in the Department of
Labor, HEW, and Commerce. If there truly is a need
for retraining translators to use MT and CAT products,
it behooves system developers (and might even
benefit them financially) to find out if such
funding titles can be used to help train translators
in the use of truly viable MT systems.
3. It should be the role of an organization such
as MT Summit III to launch a campaign aimed at helping
people everywhere to understand what human translation
and machine translation can and cannot do so as
to counter a growing trend towards fast-word language
consumption and use.
4. Concomitantly, those present at this Conference
should make their will known on an international
scale that there is no place in the MT Community
for those who falsify the facts about the capabilities
of either MT or human translators. The fact that
foreign language courses, both live and recorded,
have been deceitfully marketed for decades should
not be used as an excuse to do the same with MT.
I have appended a brief Code of Ethics document
for discussion of this matter.
5. Since AI and expert systems are on the lips of
many as the next direction for MT, a useful first
step in this direction might be the creation of
a simple expert system which prospective clients
might use to determine if their translation needs
are best met by MT, human translation, or some combination
of both. I would be pleased to take part in the
design of such a program.
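The expert system proposed in suggestion 5 can be pictured as a handful of rules applied to facts supplied by the client. What follows is only a minimal sketch: the questions asked, the rules themselves, and the names TranslationJob and recommend_approach are illustrative assumptions, not criteria put forward in this paper. A usable system would need rules elicited from working translators and MT specialists.

```python
from dataclasses import dataclass


@dataclass
class TranslationJob:
    """Hypothetical facts a prospective client might supply."""
    machine_readable: bool        # is the source text already in electronic form?
    repetitive_text: bool         # e.g. manuals with recurring phrasing
    glossary_available: bool      # does a vetted terminology list exist?
    publication_quality: bool     # must the result be fit to publish or file?
    post_editors_available: bool  # can trained post-editors be found in time?


def recommend_approach(job: TranslationJob) -> str:
    """Return 'human', 'MT', or 'combination' using illustrative rules only."""
    # Without machine-readable input, MT cannot even start.
    if not job.machine_readable:
        return "human"
    suited_to_mt = job.repetitive_text and job.glossary_available
    # Assumed rule: rough output on well-prepared text may need no post-editing.
    if suited_to_mt and not job.publication_quality:
        return "MT"
    # Assumed rule: polished output from MT presupposes post-editors.
    if suited_to_mt and job.post_editors_available:
        return "combination"
    # Everything else defaults to human translation.
    return "human"


# The lawsuit described earlier: paper documents, no glossaries, tight deadline.
lawsuit = TranslationJob(False, False, False, True, False)
print(recommend_approach(lawsuit))
```

Even this toy version makes one point visible: the recommendation depends on questions about the job, such as post-editor availability, that have to be answered before any system is switched on.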
DRAFT CODE OF ETHICS:
1. No claims about existing or pending
MT products should be made which indicate that MT
can reduce the number of human translators or the
total cost of translation work unless all costs
for the MT project have been scrupulously revealed,
including the total price for the system, fees or
salaries for those running it, training costs for
such workers, training costs for additional pre-editors
or post-editors including those who fail at this
task, and total costs of amortization over the full
period of introducing such a system.
2. No claims should be made for any MT system in
terms of "percentage of accuracy," unless
this figure is also spelled out in terms of number
of errors per page. Any unwillingness to recognize
errors as errors shall be considered a violation
of this condition, except in those cases where totally
error-free work is not required or requested.
3. No claim should be made that any MT system produces
"better-quality output" than human translators
unless such a claim has been thoroughly quantified
to the satisfaction of all parties. Any such claim
should be regarded as merely anecdotal until proved
otherwise.
4. Researchers and developers should devote serious
study to the issue of whether their products might
generate less sales resistance, public confusion,
and resentment from translators if the name of the
entire field were to be changed from "machine
translation" or "computer translation"
to "computer assisted language conversion."
5. The computer translation industry should bear
the cost of setting up an equitably balanced committee
of MT workers and translators to oversee the functioning
of this Code of Ethics.
6. Since translation is an intrinsically international
industry, this Code of Ethics must also be international
in its scope, and any company violating its tenets
on the premise that they are not valid in its country
shall be considered in violation of this Code. Measures
shall be taken to expose and punish habitual offenders.
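Item 2 of the draft code asks that any "percentage of accuracy" be restated as a number of errors per page. The conversion is simple arithmetic, sketched here under two assumptions that are not in the text above: that accuracy is counted per word, and that a page holds about 250 words. The function name errors_per_page is likewise only illustrative.

```python
def errors_per_page(accuracy_percent: float, words_per_page: int = 250) -> float:
    """Convert a per-word accuracy percentage into expected errors per page.

    Both the per-word interpretation and the 250-word default are
    assumptions made for illustration.
    """
    if not 0.0 <= accuracy_percent <= 100.0:
        raise ValueError("accuracy_percent must be between 0 and 100")
    error_rate = (100.0 - accuracy_percent) / 100.0
    return error_rate * words_per_page


# Restate a few claimed accuracy figures as errors per page.
for claimed in (85.0, 90.0, 99.0):
    print(f"{claimed:.0f}% accurate -> {errors_per_page(claimed):.1f} errors per page")
```

Under these assumptions, a claim of 90% accuracy amounts to about 25 errors on every page, and even 99% leaves two or three.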
Respectfully Submitted by
Alex Gross, Co-Director
Cross-Cultural Research Projects
alexilen@sprynet.com
NOTES:
(1) Kimmo Kettunen, in a letter to Computational
Linguistics, vol. 12, No. 1, January-March, 1986
(2) Shoshana Zuboff: In the Age
of the Smart Machine: The Future of Work and Power,
Basic Books, 1991.