Learning Morphosyntactic Analyzers from the Bible via Iterative Annotation Projection across 26 Languages

Garrett Nicolai, David Yarowsky


Abstract
A large percentage of computational tools are concentrated in a very small subset of the planet’s languages. Compounding the issue, many languages lack the high-quality linguistic annotation necessary for the construction of such tools with current machine learning methods. In this paper, we address both issues simultaneously: leveraging the high accuracy of English taggers and parsers, we project morphological information onto translations of the Bible in 26 varied test languages. Using an iterative discovery, constraint, and training process, we build inflectional lexica in the target languages. Through a combination of iteration, ensembling, and reranking, we see double-digit relative error reductions in lemmatization and morphological analysis over a strong initial system.
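As a rough illustration of the projection step the abstract describes, the following minimal sketch copies morphological tags from English tokens to aligned target-language tokens and keeps the most frequent tag per word form. It is not the authors' implementation. The function names (`project_tags`, `build_lexicon`), the `min_votes` threshold, the tag strings, and the toy sentence pair are all assumptions made for illustration, and the word alignment is taken as given rather than computed.

```python
from collections import Counter, defaultdict


def project_tags(en_tags, tgt_tokens, alignment):
    """Copy each English token's tag onto the target token aligned to it.

    en_tags:    list of tag strings, one per English token (hypothetical format)
    tgt_tokens: list of target-language word forms
    alignment:  list of (english_index, target_index) pairs, assumed given
    """
    votes = defaultdict(Counter)
    for en_i, tgt_j in alignment:
        votes[tgt_tokens[tgt_j]][en_tags[en_i]] += 1
    return votes


def build_lexicon(votes, min_votes=1):
    """Keep the most frequent projected tag for each target word form."""
    lexicon = {}
    for word, counter in votes.items():
        tag, count = counter.most_common(1)[0]
        if count >= min_votes:  # hypothetical confidence threshold
            lexicon[word] = tag
    return lexicon


# Toy example (invented for illustration): "he walked" aligned to "él caminó".
en_tags = ["PRON", "V;PST"]
tgt_tokens = ["él", "caminó"]
alignment = [(0, 0), (1, 1)]

votes = project_tags(en_tags, tgt_tokens, alignment)
lexicon = build_lexicon(votes)
```

In a sketch like this, raising `min_votes` trades coverage for precision. The iteration, ensembling, and reranking mentioned in the abstract would operate on top of such an initial lexicon and are not shown here.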
Anthology ID:
P19-1172
Volume:
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2019
Address:
Florence, Italy
Editors:
Anna Korhonen, David Traum, Lluís Màrquez
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
1765–1774
URL:
https://aclanthology.org/P19-1172/
DOI:
10.18653/v1/P19-1172
Cite (ACL):
Garrett Nicolai and David Yarowsky. 2019. Learning Morphosyntactic Analyzers from the Bible via Iterative Annotation Projection across 26 Languages. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 1765–1774, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
Learning Morphosyntactic Analyzers from the Bible via Iterative Annotation Projection across 26 Languages (Nicolai & Yarowsky, ACL 2019)
PDF:
https://aclanthology.org/P19-1172.pdf
Supplementary:
 P19-1172.Supplementary.pdf
Video:
 https://aclanthology.org/P19-1172.mp4
