Visual Prediction Improves Zero-Shot Cross-Modal Machine Translation

Tosho Hirasawa, Emanuele Bugliarello, Desmond Elliott, Mamoru Komachi


Abstract
Multimodal machine translation (MMT) systems have been successfully developed in recent years for a few language pairs. However, training such models usually requires tuples of a source language text, target language text, and images. Obtaining these data involves expensive human annotations, making it difficult to develop models for unseen text-only language pairs. In this work, we propose the task of zero-shot cross-modal machine translation, aiming to transfer multimodal knowledge from an existing multimodal parallel corpus into a new translation direction. We also introduce a novel MMT model with a visual prediction network to learn visual features grounded on multimodal parallel data and provide pseudo-features for text-only language pairs. With this training paradigm, our MMT model outperforms its text-only counterpart. In our extensive analyses, we show that (i) the selection of visual features is important, and (ii) training on image-aware translations and being grounded on a similar language pair are mandatory.
Anthology ID:
2023.wmt-1.47
Volume:
Proceedings of the Eighth Conference on Machine Translation
Month:
December
Year:
2023
Address:
Singapore
Editors:
Philipp Koehn, Barry Haddow, Tom Kocmi, Christof Monz
Venue:
WMT
SIG:
SIGMT
Publisher:
Association for Computational Linguistics
Pages:
522–535
URL:
https://aclanthology.org/2023.wmt-1.47/
DOI:
10.18653/v1/2023.wmt-1.47
Cite (ACL):
Tosho Hirasawa, Emanuele Bugliarello, Desmond Elliott, and Mamoru Komachi. 2023. Visual Prediction Improves Zero-Shot Cross-Modal Machine Translation. In Proceedings of the Eighth Conference on Machine Translation, pages 522–535, Singapore. Association for Computational Linguistics.
Cite (Informal):
Visual Prediction Improves Zero-Shot Cross-Modal Machine Translation (Hirasawa et al., WMT 2023)
PDF:
https://aclanthology.org/2023.wmt-1.47.pdf
