CUNI Systems in WMT21: Revisiting Backtranslation Techniques for English-Czech NMT

Petr Gebauer, Ondřej Bojar, Vojtěch Švandelík, Martin Popel


Abstract
We describe our two NMT systems submitted to the WMT2021 shared task in English-Czech news translation: CUNI-DocTransformer (document-level CUBBITT) and CUNI-Marian-Baselines. We improve the former with better sentence-segmentation pre-processing and a post-processing step that fixes errors in numbers and units. We use the latter for experiments with various backtranslation techniques.
Anthology ID:
2021.wmt-1.7
Volume:
Proceedings of the Sixth Conference on Machine Translation
Month:
November
Year:
2021
Address:
Online
Editors:
Loïc Barrault, Ondřej Bojar, Fethi Bougares, Rajen Chatterjee, Marta R. Costa-jussà, Christian Federmann, Mark Fishel, Alexander Fraser, Markus Freitag, Yvette Graham, Roman Grundkiewicz, Paco Guzmán, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, Tom Kocmi, André Martins, Makoto Morishita, Christof Monz
Venue:
WMT
SIG:
SIGMT
Publisher:
Association for Computational Linguistics
Pages:
123–129
URL:
https://aclanthology.org/2021.wmt-1.7/
Cite (ACL):
Petr Gebauer, Ondřej Bojar, Vojtěch Švandelík, and Martin Popel. 2021. CUNI Systems in WMT21: Revisiting Backtranslation Techniques for English-Czech NMT. In Proceedings of the Sixth Conference on Machine Translation, pages 123–129, Online. Association for Computational Linguistics.
Cite (Informal):
CUNI Systems in WMT21: Revisiting Backtranslation Techniques for English-Czech NMT (Gebauer et al., WMT 2021)
PDF:
https://aclanthology.org/2021.wmt-1.7.pdf
