mAggretriever: A Simple yet Effective Approach to Zero-Shot Multilingual Dense Retrieval

Sheng-Chieh Lin, Amin Ahmad, Jimmy Lin


Abstract
Multilingual information retrieval (MLIR) is a crucial yet challenging task because it requires human annotations in multiple languages, which makes training data creation labor-intensive. In this paper, we introduce mAggretriever, which effectively leverages semantic and lexical features from pre-trained multilingual transformers (e.g., mBERT and XLM-R) for dense retrieval. To enhance training and inference efficiency, we employ approximate masked-language modeling prediction for computing lexical features, reducing the GPU memory requirement for mAggretriever fine-tuning by 70–85%. Empirical results demonstrate that mAggretriever, fine-tuned solely on English training data, surpasses existing state-of-the-art multilingual dense retrieval models that undergo further training on large-scale MLIR training data. Our code is available at url.
Anthology ID:
2023.emnlp-main.715
Volume:
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Month:
December
Year:
2023
Address:
Singapore
Editors:
Houda Bouamor, Juan Pino, Kalika Bali
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
11688–11696
URL:
https://aclanthology.org/2023.emnlp-main.715/
DOI:
10.18653/v1/2023.emnlp-main.715
Cite (ACL):
Sheng-Chieh Lin, Amin Ahmad, and Jimmy Lin. 2023. mAggretriever: A Simple yet Effective Approach to Zero-Shot Multilingual Dense Retrieval. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 11688–11696, Singapore. Association for Computational Linguistics.
Cite (Informal):
mAggretriever: A Simple yet Effective Approach to Zero-Shot Multilingual Dense Retrieval (Lin et al., EMNLP 2023)
PDF:
https://aclanthology.org/2023.emnlp-main.715.pdf
Video:
https://aclanthology.org/2023.emnlp-main.715.mp4
