Brita Banitz
Universidad de las Américas Puebla, San Andrés Cholula, México
1. Introduction
1.2 Objectives
2. Approaches to MT
2.1 Rule-based MT
2.2 Statistical MT
3. Evaluation of MT output
Sentence
   18        27      29     119      26      97      20
   19        28      26      85      23      62      25
   20        17      16      38      15      36      14
   21        83      74     219      72     182      66
   22        44      44     150      44     104      42
   23        31      25     103      26     101      29
   24        45      44     198      42     155      49
Total       687     683      --     630      --     624
Average    28.6    28.5    92.2    26.3    73.1      26
Source: Author
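
As a quick arithmetic check on the table above, each value in the "average" row is the corresponding "total" divided by the 24 test sentences, rounded to one decimal place; a minimal Python sketch (the four totals are the ones the table reports, and the two columns without totals presumably average the same way):

    # Each published average is the column total divided by the 24 test
    # sentences; the table rounds the result to one decimal place.
    for total in (687, 683, 630, 624):
        print(total / 24)  # 28.625, 28.458..., 26.25, 26.0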
Fluency: How do you judge the fluency of this translation?
  5 = Flawless German
  4 = Good German
  3 = Non-native German
  2 = Disfluent German
  1 = Incomprehensible

Adequacy: How much of the meaning expressed in the reference translation is also expressed in the hypothesis translation?
  5 = All
  4 = Most
  3 = Much
  2 = Little
  1 = None
Source: Author
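
The two five-point scales above lend themselves to simple aggregation: each system's fluency and adequacy scores are averaged over all judged sentences. A minimal sketch of that aggregation in Python; the ratings below are invented for illustration and are not the study's data:

    from statistics import mean

    # Hypothetical (fluency, adequacy) pairs, one per judged sentence;
    # the values are invented for illustration only.
    ratings = {
        "Systran": [(3, 4), (2, 3), (4, 4)],
        "Google":  [(4, 5), (3, 4), (4, 4)],
    }

    for system, pairs in ratings.items():
        fluency = mean(f for f, _ in pairs)
        adequacy = mean(a for _, a in pairs)
        print(f"{system}: fluency {fluency:.2f}, adequacy {adequacy:.2f}")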
The most common error for both systems was semantic in nature.
For Systran, this was an expected result since, according to Costa-
Jussà et al., RBMT systems follow a word-for-word translation
methodology, resulting in output that “tends to be literal and lacks
fluency” (Costa-Jussà et al. 252). A particular problem for these
systems is therefore lexical ambiguity, where “one word can be
interpreted in more than one way” (Hutchins and Somers 85), as is
the case with homographs and polysemes. Homographs are words
that are spelled the same way but have different meanings.
Systran, for example, incorrectly translated the word “sentence”
in sentence 24 as “Strafe” [penalty] instead of “Satz” [sentence],
whereas Google translated the homograph correctly. Yet while
there were only three cases of homographs in the analyzed data
(two by Systran and one by Google), most of the sentences, for
both Systran and Google, had problems with polysemy.
Polysemes are words carrying several related meanings. One
example involved the verb “know,” which was incorrectly translated
as “kennen” [to know somebody/something] by Systran yet
correctly rendered as “wissen” [to know something about somebody/
something] by Google. As is the case with polysemes generally, the
choice of the correct target word depended on the context (Somers 431).
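
To make the problem of lexical ambiguity concrete, the toy Python sketch below (my illustration, not a description of Systran's actual dictionaries or rules, which are far richer) shows why a purely context-free, word-for-word lookup must commit to a single sense of a homograph or polyseme:

    # Toy word-for-word lexicon: one fixed target equivalent per source word.
    literal_lexicon = {
        "sentence": "Strafe",  # legal sense; the grammatical sense needs "Satz"
        "know": "kennen",      # acquaintance sense; knowing a fact needs "wissen"
    }

    def word_for_word(words):
        # Each word is translated in isolation; no sentence context is
        # consulted, which is exactly where lexical ambiguity bites.
        return [literal_lexicon.get(w, w) for w in words]

    print(word_for_word(["I", "know", "this", "sentence"]))
    # -> ['I', 'kennen', 'this', 'Strafe']: each equivalent is right in some
    # context, but a fixed lookup cannot pick the sense the context demands.

Whatever disambiguation a real RBMT system layers on top, the sketch illustrates why context-dependent word choice is the hard part of the task.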
Polysemy was the most common error type for both systems. Of the
24 target sentences produced by Systran, 18 had issues with
polysemy. For a statistical MT system like Google, however, this
problem was not expected, since Costa-Jussà et al. do not list it as a
potential weakness of the statistical approach. Still, 13 of Google's
24 target sentences had issues with polysemy, suggesting that this
result is most likely a function of the type of source text chosen for
this analysis. Since the text is literary in nature, I believe it is open to
more interpretation and thus more prone to lexical ambiguity.
4. Conclusions
References
Farrús, Mireia, et al. “Study and correlation analysis of linguistic, perceptual, and automatic machine translation evaluations.” Journal of the American Society for Information Science and Technology 63.1 (2012): 174-184.
Snover, Matthew, et al. “A study of translation edit rate with targeted human annotation.” Proceedings of the Association for Machine Translation in the Americas. 2006. Available at: https://www.cs.umd.edu/~snover/pub/amta06/ter_amta.pdf. Accessed 13 February 2019.
Quah, Chiew Kin. Translation and technology. New York: Palgrave Macmillan,
2006.
Zydroń, Andrzej, and Qun Liu. “Measuring the benefits of using SMT.” MultiLingual 1/2 (2017): 63-66. Available at: http://dig.multilingual.com/2017-01-02/index.html?page=63. Accessed 13 February 2019.