WeCanTalk: A New Multi-language, Multi-modal Resource for Speaker Recognition

Karen Jones, Kevin Walker, Christopher Caruso, Jonathan Wright, Stephanie Strassel

Abstract

The WeCanTalk (WCT) Corpus is a new multi-language, multi-modal resource for speaker recognition. The corpus contains Cantonese, Mandarin and English telephony and video speech data from over 200 multilingual speakers located in Hong Kong. Each speaker contributed at least 10 telephone conversations of 8-10 minutes’ duration collected via a custom telephone platform based in Hong Kong. Speakers also uploaded at least 3 videos in which they were both speaking and visible, along with one selfie image. At least half of the calls and videos for each speaker were in Cantonese, while their remaining recordings featured one or more different languages. Both calls and videos were made in a variety of noise conditions. All speech and video recordings were audited by experienced multilingual annotators for quality including presence of the expected language and for speaker identity. The WeCanTalk Corpus has been used to support the NIST 2021 Speaker Recognition Evaluation and will be published in the LDC catalog.

Anthology ID: 2022.lrec-1.369
Volume: Proceedings of the Thirteenth Language Resources and Evaluation Conference
Month: June
Year: 2022
Address: Marseille, France
Editors: Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Jan Odijk, Stelios Piperidis
Venue: LREC
Publisher: European Language Resources Association
Pages: 3451–3456
URL: https://aclanthology.org/2022.lrec-1.369/
Cite (ACL): Karen Jones, Kevin Walker, Christopher Caruso, Jonathan Wright, and Stephanie Strassel. 2022. WeCanTalk: A New Multi-language, Multi-modal Resource for Speaker Recognition. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 3451–3456, Marseille, France. European Language Resources Association.
Cite (Informal): WeCanTalk: A New Multi-language, Multi-modal Resource for Speaker Recognition (Jones et al., LREC 2022)
PDF: https://aclanthology.org/2022.lrec-1.369.pdf