


26th SPECOM 2024: Belgrade, Serbia - Part I
- Alexey Karpov, Vlado Delic:
Speech and Computer - 26th International Conference, SPECOM 2024, Belgrade, Serbia, November 25-28, 2024, Proceedings, Part I. Lecture Notes in Computer Science 15299, Springer 2025, ISBN 978-3-031-77960-2
Invited Papers
- Ivan Kraljevski, Frank Duckhorn, Daniel Sobe, Constanze Tschöpe, Matthias Wolff:
Preserving Language Heritage Through Speech Technology: The Case of Upper Sorbian. 3-22
- Milan Secujski, Branislav M. Popovic, Darko Pekar, Niksa Jakovljevic, Edvin Pakoci, Sinisa Suzic, Tijana V. Nosek, Nikola Simic, Vuk Stanojev, Vlado Delic:
Retrospective and Perspectives of TTS & STT Technology Development and Implementation for South Slavic Under-Resourced Languages. 23-42
Automatic Speech Recognition
- Yue Luo, Péter Mihajlik:
Comparison of Well and Lower-Resourced Self-training in ASR. 45-56
- Irina S. Kipyatkova, Ildar Kagirov, Mikhail Dolgushin, Alexandra Rodionova:
Towards a Livvi-Karelian End-to-End ASR System. 57-68
- Vishwa Gupta:
Advances in OpenASR21 Evaluation with Increased Temporal Resolution for Speech Self-supervised Learning Models. 69-81
- Sergei Katkov, Antonio Liotta, Alessandro Vietti:
Benchmarking Whisper Under Diverse Audio Transformations and Real-Time Constraints. 82-91
- Ahmet Gunduz, Yunsu Kim, Kamer Ali Yuksel, Mohamed Al-Badrashiny, Thiago Castro Ferreira, Hassan Sawaf:
AutoMode-ASR: Learning to Select ASR Systems for Better Quality and Cost. 92-103
- Manuel Torralbo, Ariane Méndez, Maia Agirre, Arantza del Pozo:
Pre-training and Adverse Audio Samples for Data-Efficient Wake Word Detection. 104-118
- Pranav Karande, Balaram Sarkar, Chandresh Kumar Maurya:
Cross-Lingual Summarization of Speech-to-Speech Translation: A Baseline. 119-133
Speech and Language Resources
- Nikola Ljubesic, Peter Rupnik, Danijel Korzinek:
The ParlaSpeech Collection of Automatically Generated Speech and Text Datasets from Parliamentary Proceedings. 137-150
- Tatiana Y. Sherstinova, Irina Petrova:
ESC Corpus of Spoken Russian: Everyday Student Conversations Captured Through Continuous Speech Recording in Natural Communicative Environments. 151-162
- Denis Ivanko, Dmitry Ryumin, Alexandr Axyonov, Alexey M. Kashevnik, Alexey Karpov:
OpenAV: Bilingual Dataset for Audio-Visual Voice Control of a Computer for Hand Disabled People. 163-173
- Velka Popova, Dimitar Popov:
Bulgarian Speech Resources in the CHILDES System. 174-186
- Natalia Bogdanova-Beglarian, Olga Blinova, Maria Khokhlova, Tatiana Y. Sherstinova, Tatiana I. Popova:
Multiword Units in Russian Everyday Speech: Empirical Classification and Corpus-Based Studies. 187-200
- Rodmonga Potapova, Vsevolod Potapov, Ekaterina Karimova, Leonid Motovskikh, Nikolay Bobrov:
Neurophysiological Correlates of Textual Modulation in Visual Stimuli: An Experimental Study of Russian and English Memes. 201-215
Speech Synthesis and Perception
- Tijana V. Nosek, Sinisa Suzic, Milan Secujski, Vuk Stanojev, Darko Pekar, Vlado Delic:
End-to-End Speech Synthesis for the Serbian Language Based on Tacotron. 219-229
- Shaimaa Alwaisi, Mohammed Salah Al-Radhi, Géza Németh:
ChildTinyTalks (CTT): A Benchmark Dataset and Baseline for Expressive Child Speech Synthesis. 230-240
- Anna Borzykh, Tatiana Shevchenko:
Multidimensional Rhythm: Comparing Rhythmic Properties of Australian and New Zealand Monologues. 241-250
- Anastasia Ananeva, Uliana E. Kochetkova:
Influence of Linguistic and Sociolinguistic Factors on Speech Rate Perception. 251-264
- Daria Guseva, Olga Mitrofanova, Mikhail Dolgushin:
Human and Machine Keyphrase Perception in Russian Text and Speech. 265-280
- Elena E. Lyakso, Olga V. Frolova, Anton Matveev, Aleksandr Nikolaev, Ruban Nersisson:
Assessment of Children's Ability to Manifest Emotions in Facial Expressions, Voice and Speech by Humans, Automatic, and on a Likert Scale. 281-294
Speech Processing for Medicine
- Gábor Gosztolya, László Tóth, Veronika Svindt, Judit Bóna, Ildikó Hoffmann:
Investigating the Utility of wav2vec 2.0 Hidden Layers for Detecting Multiple Sclerosis. 297-308
- Danila Mamontov, Sebastian Zepf, Alexey Karpov, Wolfgang Minker:
Cross-Cultural Automatic Depression Detection Based on Audio Signals. 309-323
- Lokesh Kumar, Kumar Kaustubh, S. R. Mahadeva Prasanna:
Depression Classification Using Token Merging-Based Speech Spectrotemporal Transformer. 324-335
- Mary Idamkina, Andrea Corradini:
Detecting Depression from Audio Data. 336-351
- Dosti Aziz, Dávid Sztahó:
Binary and Multiclass Classification of Dysphonia Using Whisper Encoder and One-Dimensional Convolutional Neural Network. 352-366
- German Egle, Dariya Novokhrestova, Svetlana Tomilina, Evgeny Kostyuchenko:
Approach to Assessing the Quality of Syllable Pronunciation by Patients in the Process of Speech Rehabilitation Based on Comparison with Healthy Speakers. 367-376
- Philipp L. Harnisch, Daniel Schuhmann, Stefan Hillmann:
A Comparative Study for Contextualized Spoken Answer Classification in German Medical Questionnaires. 377-391
