The aim of the metric is to be “a consistent standard against which the
translation quality of automotive service information can be objectively
measured,
- regardless of the source language,
- regardless of the target language,
- regardless of how the translation is performed--i.e., human translation
  or machine translation.”
This is a points-based method: the more points a translation receives,
the poorer its quality.
As a brief overview, the metric divides errors into seven categories,
each of which the metric describes in detail:
- Wrong Term
- Wrong Meaning
- Omission
- Structural Error
- Misspelling
- Punctuation Error
- Miscellaneous Error
Each category is weighted, and some error categories are considered to
impinge on quality more than others. Misspellings, for example, always
receive fewer points than errors in the Wrong Term category.
When an error is identified as belonging to a category, the reviewer
decides whether the error is serious or minor.
A serious error carries more points than a minor error.
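The scoring scheme described above can be sketched in a few lines. This is a minimal illustration only: the point values in `WEIGHTS` are hypothetical and are not the values published in the metric, and the `score` function and the `review` list are names invented for this example. The values were chosen so that a misspelling always scores fewer points than a Wrong Term error and a serious error always scores more than a minor one in the same category.

```python
# Hypothetical weights for illustration; NOT the published J2450 values.
# Each category maps to (points for a serious error, points for a minor error).
WEIGHTS = {
    "Wrong Term": (5, 3),
    "Wrong Meaning": (5, 3),
    "Omission": (4, 2),
    "Structural Error": (4, 2),
    "Misspelling": (2, 1),
    "Punctuation Error": (2, 1),
    "Miscellaneous Error": (3, 1),
}


def score(errors, weights=WEIGHTS):
    """Sum the points for a list of (category, severity) pairs.

    A higher total means a poorer translation.
    """
    total = 0
    for category, severity in errors:
        if severity not in ("serious", "minor"):
            raise ValueError(f"unknown severity: {severity!r}")
        serious_points, minor_points = weights[category]
        total += serious_points if severity == "serious" else minor_points
    return total


# Errors a reviewer might log for one translation (made-up data).
review = [
    ("Wrong Term", "serious"),
    ("Misspelling", "minor"),
    ("Omission", "minor"),
]
print(score(review))  # 5 + 1 + 2 = 8
```

A translation with no logged errors scores 0, the best possible result under this scheme.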
Evaluation
The metric is easy to follow and easy to implement, and it is an excellent
step towards an objective measure of linguistic quality. It is also highly
customisable: should you feel that a spelling error is more damaging to a
translation than an incorrect term, the weighting is easy to change.
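As a rough sketch of what such a customisation might look like, the snippet below scores the same list of errors under two weight tables. All names (`DEFAULT_WEIGHTS`, `reweight`, `score`) and all point values are assumptions made for this example, not part of the metric itself.

```python
# Hypothetical (serious, minor) point values; NOT the published J2450 values.
DEFAULT_WEIGHTS = {"Wrong Term": (5, 3), "Misspelling": (2, 1)}


def reweight(weights, overrides):
    """Return a copy of the weight table with some categories replaced."""
    updated = dict(weights)
    updated.update(overrides)
    return updated


def score(errors, weights):
    """Sum the points for a list of (category, severity) pairs."""
    return sum(
        weights[category][0 if severity == "serious" else 1]
        for category, severity in errors
    )


# The same reviewer findings (made-up data), scored under two weightings.
errors = [
    ("Misspelling", "serious"),
    ("Misspelling", "serious"),
    ("Wrong Term", "minor"),
]
strict_spelling = reweight(DEFAULT_WEIGHTS, {"Misspelling": (6, 3)})

print(score(errors, DEFAULT_WEIGHTS))  # 2 + 2 + 3 = 7
print(score(errors, strict_spelling))  # 6 + 6 + 3 = 15
```

Because `reweight` returns a copy, the default table stays untouched. Scores produced under different weightings are not comparable with each other.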
The results of the metric can then be used to benchmark linguistic
standards and can serve as a basis for discussion with both clients and
translation groups.
However, there are the following drawbacks:
1. The metric is meant to cover automotive service information only and is
   therefore not a good tool for evaluating a translation where style and
   voice are important issues.
2. Results of the reviews have to be collated manually, and comparisons
   can be difficult if you are working directly from spreadsheets.
3. More guidance is needed in the metric so that reviewers are clear on
   what constitutes a serious error and what constitutes a minor one.
   Evaluators then need to be trained on this aspect so that there is a
   clear and common understanding. Also, the results from our testing of
   the metric have shown that two error severities (serious and minor) are
   not sufficient.
4. Once the points have been allocated, there is no guidance as to what
   constitutes a good or a bad mark. In one test of the metric we carried
   out, translations of the same piece of text into five different
   languages were sent to our in-country reviewers. After evaluation we
   found that the language with the second-best score was given a scathing
   review by its evaluator, whereas reviewers for languages that had
   scored more points were, overall, happy with the translation.
The J2450 Translation Quality Metric can be purchased through the SAE internet site.