How Much Context Span is Enough? Examining Context-Related Issues for Document-level MT

Sheila Castilho


Abstract
This paper analyses how much context span is necessary to solve different context-related issues, namely reference, ellipsis, gender, number, lexical ambiguity, and terminology, when translating from English into Portuguese. We use the DELA corpus, which consists of 60 documents across six domains (subtitles, literary, news, reviews, medical, and legislation). We find that the shortest context span needed to disambiguate an issue can appear in different positions in the document, including preceding, following, and global context, as well as world knowledge. Moreover, the average length of that span depends on the issue type as well as the domain. Finally, we show that the standard approach of relying on only two preceding sentences as context might not be enough, depending on the domain and issue type.
Anthology ID:
2022.lrec-1.323
Volume:
Proceedings of the Thirteenth Language Resources and Evaluation Conference
Month:
June
Year:
2022
Address:
Marseille, France
Editors:
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
3017–3025
URL:
https://aclanthology.org/2022.lrec-1.323/
Cite (ACL):
Sheila Castilho. 2022. How Much Context Span is Enough? Examining Context-Related Issues for Document-level MT. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 3017–3025, Marseille, France. European Language Resources Association.
Cite (Informal):
How Much Context Span is Enough? Examining Context-Related Issues for Document-level MT (Castilho, LREC 2022)
PDF:
https://aclanthology.org/2022.lrec-1.323.pdf
