IEEE/ACM Transactions on Audio, Speech and Language Processing, Volume 25
Volume 25, Number 1, January 2017
- Jin Chu Wu, Alvin F. Martin, Craig S. Greenberg, Raghu N. Kacker: The Impact of Data Dependence on Speaker Recognition Evaluation. 1-14
- Hélène Papadopoulos, George Tzanetakis: Models for Music Analysis From a Markov Logic Networks Perspective. 15-30
- Ahmed Al-Tmeme, Wai Lok Woo, Satnam Singh Dlay, Bin Gao: Underdetermined Convolutive Source Separation Using GEM-MU With Variational Approximated Optimum Model Order NMF2D. 31-45
- Mark A. Hasegawa-Johnson, Preethi Jyothi, Daniel McCloy, Majid Mirbagheri, Giovanni M. Di Liberto, Amit Das, Bradley Ekin, Chunxi Liu, Vimal Manohar, Hao Tang, Edmund C. Lalor, Nancy F. Chen, Paul Hager, Tyler Kekona, Rose Sloan, Adrian K. C. Lee: ASR for Under-Resourced Languages From Probabilistic Transcription. 46-59
- Zhen Huang, Sabato Marco Siniscalchi, Chin-Hui Lee: Bayesian Unsupervised Batch and Online Speaker Adaptation of Activation Function Parameters in Deep Models for Automatic Speech Recognition. 60-71
- Simon Durand, Juan Pablo Bello, Bertrand David, Gaël Richard: Robust Downbeat Tracking Using an Ensemble of Convolutional Networks. 72-85
- Zhehuai Chen, Yimeng Zhuang, Yanmin Qian, Kai Yu: Phone Synchronous Speech Recognition With CTC Lattices. 86-97
- Bo Wu, Kehuang Li, Minglei Yang, Chin-Hui Lee: A Reverberation-Time-Aware Approach to Speech Dereverberation Based on Deep Neural Networks. 98-107
- Hongjie Chen, Lei Xie, Cheung-Chi Leung, Xiaoming Lu, Bin Ma, Haizhou Li: Modeling Latent Topics and Temporal Distance for Story Segmentation of Broadcast News. 108-119
- Hua Xing, John H. L. Hansen: Single Sideband Frequency Offset Estimation and Correction for Quality Enhancement and Speaker Recognition. 120-132
- Andreas I. Koutrouvelis, Richard Christian Hendriks, Richard Heusdens, Jesper Jensen: Relaxed Binaural LCMV Beamforming. 133-148
- Morten Kolbæk, Zheng-Hua Tan, Jesper Jensen: Speech Intelligibility Potential of General and Specialized Deep Neural Network Based Speech Enhancement Systems. 149-163
- Jakob Abeßer, Klaus Frieler, Estefanía Cano, Martin Pfleiderer, Wolf-Georg Zaddach: Score-Informed Analysis of Tuning, Intonation, Pitch Modulation, and Dynamics in Jazz Solos. 168-177
- Alastair H. Moore, Christine Evers, Patrick A. Naylor: Direction of Arrival Estimation in the Spherical Harmonic Domain Using Subspace Pseudointensity Vectors. 178-192
- Kun Li, Xiaojun Qian, Helen M. Meng: Mispronunciation Detection and Diagnosis in L2 English Speech Using Multidistribution Deep Neural Networks. 193-207
- Yoonchang Han, Jae-Hun Kim, Kyogu Lee: Deep Convolutional Neural Networks for Predominant Instrument Recognition in Polyphonic Music. 208-221
Volume 25, Number 2, February 2017
- Hanchi Chen, Thushara Dheemantha Abhayapala, Prasanga N. Samarasinghe, Wen Zhang: Direct-to-Reverberant Energy Ratio Estimation Using a First-Order Microphone. 226-237
- Peter Bell, Pawel Swietojanski, Steve Renals: Multitask Learning of Context-Dependent Targets in Deep Neural Network Acoustic Models. 238-247
- Rui Zhao, Kezhi Mao: Topic-Aware Deep Compositional Models for Sentence Classification. 248-260
- Dalia El Badawy, Ngoc Q. K. Duong, Alexey Ozerov: On-the-Fly Audio Source Separation - A Novel User-Friendly Framework. 261-272
- Filip Elvander, Johan Sward, Andreas Jakobsson: Online Estimation of Multiple Harmonic Signals. 273-284
- Vincent Renkens, Hugo Van hamme: Weakly Supervised Learning of Hidden Markov Models for Spoken Language Acquisition. 285-295
- Luca Remaggi, Philip J. B. Jackson, Philip Coleman, Wenwu Wang: Acoustic Reflector Localization: Novel Image Source Reversion and Direct Localization Methods. 296-309
- Prasanga N. Samarasinghe, Thushara D. Abhayapala, Hanchi Chen: Estimating the Direct-to-Reverberant Energy Ratio Using a Spherical Harmonics-Based Spatial Correlation Model. 310-319
- Shmulik Markovich Golan, Sharon Gannot, Walter Kellermann: Combined LCMV-TRINICON Beamforming for Separating Multiple Speech Sources in Noisy and Reverberant Environments. 320-332
- Shakeel Ahmed, Muhammad Tahir Akhtar: Gain Scheduling of Auxiliary Noise and Variable Step-Size for Online Acoustic Feedback Cancellation in Narrow-Band Active Noise Control Systems. 333-343
- Gabriel Sargent, Frédéric Bimbot, Emmanuel Vincent: Estimating the Structural Segmentation of Popular Music Pieces Under Regularity Constraints. 344-358
- Jordan Cheer, Stephen Daley: An Investigation of Delayless Subband Adaptive Filtering for Multi-Input Multi-Output Active Noise Control Applications. 359-373
- Sebastian J. Schlecht, Emanuël A. P. Habets: Feedback Delay Networks: Echo Density and Mixing Time. 374-383
- Johannes Abel, Magdalena Kaniewska, Cyril Guillaume, Wouter Tirry, Tim Fingscheidt: An Instrumental Quality Measure for Artificially Bandwidth-Extended Speech Signals. 384-396
- Robert Rehr, Timo Gerkmann: An Analysis of Adaptive Recursive Smoothing with Applications to Noise PSD Estimation. 397-408
- Emilio Granell, Carlos D. Martínez-Hinarejos: Multimodal Crowdsourcing for Transcribing Handwritten Documents. 409-419
- Yaping Ma, Yegui Xiao: A New Strategy for Online Secondary-Path Modeling of Narrowband Active Noise Control. 420-434
- Jose A. Belloch, Alberto González, Enrique S. Quintana-Ortí, Miguel Ferrer, Vesa Välimäki: GPU-Based Dynamic Wave Field Synthesis Using Fractional Delay Filters and Room Compensation. 435-447
Volume 25, Number 3, March 2017
- Qi He, Feng Bao, Changchun Bao: Multiplicative Update of Auto-Regressive Gains for Codebook-Based Speech Enhancement. 457-468
- Zhongqing Wang, Sophia Yat Mei Lee, Shoushan Li, Guodong Zhou: Emotion Analysis in Code-Switching Text With Joint Factor Graph Model. 469-480
- Ashwin Bellur, Mounya Elhilali: Feedback-Driven Sensory Mapping Adaptation for Robust Speech Activity Detection. 481-492
- Zhiyuan Tang, Lantian Li, Dong Wang, Ravichander Vipperla: Collaborative Joint Training With Multitask Recurrent Model for Speech and Speaker Recognition. 493-504
- Bidisha Sharma, S. R. Mahadeva Prasanna: Sonority Measurement Using System, Source, and Suprasegmental Information. 505-518
- Hung-yi Lee, Bo-Hsiang Tseng, Tsung-Hsien Wen, Yu Tsao: Personalizing Recurrent-Neural-Network-Based Language Model by Social Network. 519-530
- Ji Ming, Danny Crookes: Speech Enhancement Based on Full-Sentence Correlation and Clean Speech Recognition. 531-543
- Quoc Truong Do, Tomoki Toda, Graham Neubig, Sakriani Sakti, Satoshi Nakamura: Preserving Word-Level Emphasis in Speech-to-Speech Translation. 544-556
- Zhenghua Li, Jiayuan Chao, Min Zhang, Wenliang Chen, Meishan Zhang, Guohong Fu: Coupled POS Tagging on Heterogeneous Annotations. 557-571
- Clement S. J. Doire, Mike Brookes, Patrick A. Naylor, Christopher M. Hicks, Dave Betts, Mohammad A. Dmour, Søren Holdt Jensen: Single-Channel Online Enhancement of Speech Corrupted by Reverberation and Noise. 572-587
- Aleksandr Sizov, Kong-Aik Lee, Tomi Kinnunen: Direct Optimization of the Detection Cost for I-Vector-Based Spoken Language Recognition. 588-597
- Imran A. Sheikh, Dominique Fohr, Irina Illina, Georges Linarès: Modelling Semantic Context of OOV Words in Large Vocabulary Continuous Speech Recognition. 598-610
- Mojtaba Farmani, Michael Syskind Pedersen, Zheng-Hua Tan, Jesper Jensen: Informed Sound Source Localization Using Relative Transfer Functions for Hearing Aid Applications. 611-623
- Vikram C. M., S. R. Mahadeva Prasanna: Epoch Extraction From Telephone Quality Speech Using Single Pole Filter. 624-636
- Motoi Omachi, Tetsuji Ogawa, Tetsunori Kobayashi: Associative Memory Model-Based Linear Filtering and Its Application to Tandem Connectionist Blind Source Separation. 637-650
- Dani Cherkassky, Sharon Gannot: Blind Synchronization in Wireless Acoustic Sensor Networks. 651-661
- Laurent Girin, Thomas Hueber, Xavier Alameda-Pineda: Extending the Cascaded Gaussian Mixture Regression Framework for Cross-Speaker Acoustic-Articulatory Mapping. 662-673
- Mohamad Hasan Bahari, Alexander Bertrand, Marc Moonen: Blind Sampling Rate Offset Estimation for Wireless Acoustic Sensor Networks Through Weighted Least-Squares Coherence Drift Estimation. 674-686
- Adam Kuklasinski, Simon Doclo, Søren Holdt Jensen, Jesper Rindom Jensen: Correction to "Maximum Likelihood PSD Estimation for Speech Enhancement in Reverberation and Noise". 687
Volume 25, Number 4, April 2017
- Sharon Gannot, Emmanuel Vincent, Shmulik Markovich Golan, Alexey Ozerov: A Consolidated Perspective on Multimicrophone Speech Enhancement and Source Separation. 692-730
- Dongwen Ying, Ruohua Zhou, Junfeng Li, Yonghong Yan: Window-Dominant Signal Subspace Methods for Multiple Short-Term Speech Source Localization. 731-744
- Sean U. N. Wood, Jean Rouat, Stéphane Dupont, Gueorgui Pironkov: Blind Speech Separation and Enhancement With GCC-NMF. 745-755
- Constantin Spille, Birger Kollmeier, Bernd T. Meyer: Combining Binaural and Cortical Features for Robust Speech Recognition. 756-767
- Yuma Koizumi, Kenta Niwa, Yusuke Hioka, Kazunori Kobayashi, Hitoshi Ohmuro: Informative Acoustic Feature Selection to Maximize Mutual Information for Collecting Target Sources. 768-779
- Takuya Higuchi, Nobutaka Ito, Shoko Araki, Takuya Yoshioka, Marc Delcroix, Tomohiro Nakatani: Online MVDR Beamformer Based on Complex Gaussian Mixture Model With Spatial Prior for Noise Robust ASR. 780-793
- Eita Nakamura, Kazuyoshi Yoshii, Shigeki Sagayama: Rhythm Transcription of Polyphonic Piano Music Based on Merged-Output HMM for Multiple Voices. 794-806
- Omid Ghahabi, Javier Hernando: Deep Learning Backend for Single and Multisession i-Vector Speaker Recognition. 807-817
- Penny Karanasou, Chunyang Wu, Mark J. F. Gales, Philip C. Woodland: I-Vectors and Structured Neural Networks for Rapid Adaptation of Acoustic Models. 818-828
- G. Aneeja, B. Yegnanarayana: Extraction of Fundamental Frequency From Degraded Speech Using Temporal Envelopes at High SNR Frequencies. 829-838
- Seyyed Saeed Sarfjoo, Cenk Demiroglu, Simon King: Using Eigenvoices and Nearest-Neighbors in HMM-Based Cross-Lingual Speaker Adaptation With Limited Data. 839-851
- Yung-Yue Chen, Jia-Hao Zhang: Background Noise Reduction Design for Dual Microphone Cellular Phones: Robust Approach. 852-862
- Liner Yang, Xinxiong Chen, Zhiyuan Liu, Maosong Sun: Improving Word Representations with Document Labels. 863-870
- Shiliang Zhang, Cong Liu, Hui Jiang, Si Wei, Li-Rong Dai, Yu Hu: Nonrecurrent Neural Structure for Long-Term Dependence. 871-884
- Xuefeng Yang, Kezhi Mao: Task Independent Fine Tuning for Word Embeddings. 885-894
- Huawei Chen: Design of Robust Broadband Beamformers Using Worst-Case Performance Optimization: A Semidefinite Programming Approach. 895-907
- Sandro Cumani, Pietro Laface: Nonlinear I-Vector Transformations for PLDA-Based Speaker Recognition. 908-919
Volume 25, Number 5, May 2017
- Manu Airaksinen, Tom Bäckström, Paavo Alku: Quadratic Programming Approach to Glottal Inverse Filtering by Joint Norm-1 and Norm-2 Optimization. 929-939
- Ofer Schwartz, Sharon Gannot, Emanuël A. P. Habets: Multispeaker LCMV Beamformer and Postfilter for Source Separation and Noise Reduction. 940-951
- Dongmei Wang, Chengzhu Yu, John H. L. Hansen: Robust Harmonic Features for Classification-Based Pitch Estimation. 952-964
- Tara N. Sainath, Ron J. Weiss, Kevin W. Wilson, Bo Li, Arun Narayanan, Ehsan Variani, Michiel Bacchiani, Izhak Shafran, Andrew W. Senior, Kean K. Chin, Ananya Misra, Chanwoo Kim: Multichannel Signal Processing With Deep Neural Networks for Automatic Speech Recognition. 965-979
- Hanieh Khalilian, Ivan V. Bajic, Rodney G. Vaughan: A Simulation Study of a Three-Dimensional Sound Field Reproduction System for Immersive Communication. 980-995
- Andreas Franck, Wenwu Wang, Filippo Maria Fazi: Sparse ℓ1-Optimal Multiloudspeaker Panning and Its Relation to Vector Base Amplitude Panning. 996-1010
- Songbin Li, Yizhen Jia, C.-C. Jay Kuo: Steganalysis of QIM Steganography in Low-Bit-Rate Speech Signals. 1011-1022
- Naoyuki Kanda, Xugang Lu, Hisashi Kawai: Maximum-a-Posteriori-Based Decoding for End-to-End Acoustic Models. 1023-1034
- Navid Shokouhi, John H. L. Hansen: Teager-Kaiser Energy Operators for Overlapped Speech Detection. 1035-1047
- Yi-Chin Huang, Chung-Hsien Wu, Yan-You Chen, Ming-Ge Shie, Jhing-Fa Wang: Personalized Spontaneous Speech Synthesis Using a Small-Sized Unsegmented Semispontaneous Speech. 1048-1060
- Jeongsoo Park, Jaeyoung Shin, Kyogu Lee: Exploiting Continuity/Discontinuity of Basis Vectors in Spectrogram Decomposition for Harmonic-Percussive Sound Separation. 1061-1074
- Xueliang Zhang, DeLiang Wang: Deep Learning Based Binaural Speech Separation in Reverberant Environments. 1075-1084
- Masood Delfarah, DeLiang Wang: Features for Masking-Based Monaural Speech Separation in Reverberant Conditions. 1085-1094
- Feiran Yang, Gerald Enzner, Jun Yang: Statistical Convergence Analysis for Optimal Control of DFT-Domain Adaptive Echo Canceler. 1095-1106
- Takashi Nose, Yusuke Arao, Takao Kobayashi, Komei Sugiura, Yoshinori Shiga: Sentence Selection Based on Extended Entropy Using Phonetic and Prosodic Contexts for Statistical Parametric Speech Synthesis. 1107-1116
- Gergely Firtha, Péter Fiala, Frank Schultz, Sascha Spors: Improved Referencing Schemes for 2.5D Wave Field Synthesis Driving Functions. 1117-1127
- Esteban Maestre, Gary P. Scavone, Julius O. Smith III: Joint Modeling of Bridge Admittance and Body Radiativity for Efficient Synthesis of String Instrument Sound by Digital Waveguides. 1128-1139
- Gongping Huang, Jacob Benesty, Jingdong Chen: On the Design of Frequency-Invariant Beampatterns With Uniform Circular Microphone Arrays. 1140-1153
- Zdenek Prusa, Péter Balázs, Peter L. Søndergaard: A Noniterative Method for Reconstruction of Phase From STFT Magnitude. 1154-1164
Volume 25, Number 6, June 2017
- Gaël Richard, Tuomas Virtanen, Juan Pablo Bello, Nobutaka Ono, Hervé Glotin: Introduction to the Special Section on Sound Scene and Event Analysis. 1169-1171
- Héctor A. Sánchez-Hevia, David Ayllón, Roberto Gil-Pita, Manuel Rosa-Zurera: Maximum Likelihood Decision Fusion for Weapon Classification in Wireless Acoustic Sensor Networks. 1172-1182
- Nithin Rao Koluguri, G. Nisha Meenakshi, Prasanta Kumar Ghosh: Spectrogram Enhancement Using Multiple Window Savitzky-Golay (MWSG) Filter for Robust Bird Sound Detection. 1183-1192
- Dan Stowell, Emmanouil Benetos, Lisa F. Gill: On-Bird Sound Recordings: Automatic Acoustic Recognition of Activities and Contexts. 1193-1206
- Brandon T. Carroll, Bradley M. Whitaker, Wayne Daley, David V. Anderson: Outlier Learning via Augmented Frozen Dictionaries. 1207-1215
- Victor Bisot, Romain Serizel, Slim Essid, Gaël Richard: Feature Learning With Matrix Factorization Applied to Acoustic Scene Classification. 1216-1229
- Yong Xu, Qiang Huang, Wenwu Wang, Peter Foster, Siddharth Sigtia, Philip J. B. Jackson, Mark D. Plumbley: Unsupervised Feature Learning Based on Deep Models for Environmental Audio Tagging. 1230-1241
- Rene Grzeszick, Axel Plinge, Gernot A. Fink: Bag-of-Features Methods for Acoustic Event Detection and Classification. 1242-1252
- Alain Rakotomamonjy: Supervised Representation Learning for Audio Scene Classification. 1253-1265
- Emmanouil Benetos, Grégoire Lafay, Mathieu Lagrange, Mark D. Plumbley: Polyphonic Sound Event Tracking Using Linear Dynamical Systems. 1266-1277
- Huy Phan, Lars Hertel, Marco Maaß, Philipp Koch, Radoslaw Mazur, Alfred Mertins: Improved Audio Scene Classification Based on Label-Tree Embeddings and Convolutional Neural Networks. 1278-1290
- Emre Çakir, Giambattista Parascandolo, Toni Heittola, Heikki Huttunen, Tuomas Virtanen: Convolutional Recurrent Neural Networks for Polyphonic Sound Event Detection. 1291-1303
- Jens Schröder, Niko Moritz, Jörn Anemüller, Stefan Goetze, Birger Kollmeier: Classifier Architectures for Acoustic Scenes and Events: Implications for DNNs, TDNNs, and Perceptual Features from DCASE 2016. 1304-1314
- Wenjun Yang, Sridhar Krishnan: Combining Temporal Features by Local Binary Pattern for Acoustic Scene Classification. 1315-1321
- David Dov, Ronen Talmon, Israel Cohen: Multimodal Kernel Method for Activity Detection of Sound Sources. 1322-1334
- Keisuke Imoto, Nobutaka Ono: Spatial Cepstrum as a Spatial Feature Using a Distributed Microphone Array for Acoustic Scene Analysis. 1335-1343
- Ivo Trowitzsch, Johannes Mohr, Youssef Kashef, Klaus Obermayer: Robust Detection of Environmental Sounds in Binaural Auditory Scenes. 1344-1356
- Abu Shafin Mohammad Mahdee Jameel, Shaikh Anowarul Fattah, Rajib Goswami, Wei-Ping Zhu, M. Omair Ahmad: Noise Robust Formant Frequency Estimation Method Based on Spectral Model of Repeated Autocorrelation of Speech. 1357-1370
- Na Li, Man-Wai Mak, Jen-Tzung Chien: DNN-Driven Mixture of PLDA for Robust Speaker Verification. 1371-1383
- Kai Wu, Vaninirappuputhenpurayil Gopalan Reju, Andy W. H. Khong, Shu Ting Goh: Swarm Intelligence Based Particle Filter for Alternating Talker Localization and Tracking Using Microphone Arrays. 1384-1397
Volume 25, Number 7, July 2017
- Yu-An Chen, Ju-Chiang Wang, Yi-Hsuan Yang, Homer H. Chen: Component Tying for Mixture Model Adaptation in Personalization of Music Emotion Recognition. 1409-1420
- Hossein Zeinali, Hossein Sameti, Lukás Burget: HMM-Based Phrase-Independent i-Vector Extractor for Text-Dependent Speaker Verification. 1421-1435
- Xinzhou Xu, Jun Deng, Nicholas Cummins, Zixing Zhang, Chen Wu, Li Zhao, Björn W. Schuller: A Two-Dimensional Framework of Multiple Kernel Subspace Learning for Recognizing Emotion in Speech. 1436-1449
- Mandy Korpusik, James R. Glass: Spoken Language Understanding for a Nutrition Dialogue System. 1450-1461
- Mahmoud Fakhry, Piergiorgio Svaizer, Maurizio Omologo: Audio Source Separation in Reverberant Environments Using β-Divergence-Based Nonnegative Factorization. 1462-1476
- Bracha Laufer-Goldshtein, Ronen Talmon, Sharon Gannot: Semi-Supervised Source Localization on Multiple Manifolds With Distributed Microphones. 1477-1491
- Donald S. Williamson, DeLiang Wang: Time-Frequency Masking in the Complex Domain for Speech Dereverberation and Denoising. 1492-1501
- Liang Lu, Steve Renals: Small-Footprint Highway Deep Neural Networks for Speech Recognition. 1502-1511
- Ina Kodrasi, Simon Doclo: Signal-Dependent Penalty Functions for Robust Acoustic Multi-Channel Equalization. 1512-1525
- Jung-Hee Kim, Jin Kim, Jae Hyeon Jeon, Sang Won Nam: Delayless Individual-Weighting-Factors Sign Subband Adaptive Filter With Band-Dependent Variable Step-Sizes. 1526-1534
- Yannan Wang, Jun Du, Li-Rong Dai, Chin-Hui Lee: A Gender Mixture Detection Approach to Unsupervised Single-Channel Speech Separation Based on Deep Neural Networks. 1535-1546
- Giacomo Vairetti, Enzo De Sena, Michael Catrysse, Søren Holdt Jensen, Marc Moonen, Toon van Waterschoot: A Scalable Algorithm for Physically Motivated and Sparse Approximation of Room Impulse Responses With Orthonormal Basis Functions. 1547-1561
Volume 25, Number 8, August 2017
- Francis Stevens, Damian T. Murphy, Lauri Savioja, Vesa Välimäki: Modeling Sparsely Reflecting Outdoor Acoustic Scenes Using the Waveguide Web. 1566-1578
- Ferdinando Olivieri, Filippo Maria Fazi, Simone Fontana, Dylan Menzies, Philip Arthur Nelson: Generation of Private Sound With a Circular Loudspeaker Array and the Weighted Pressure Matching Method. 1579-1591
- Samy Elshamy, Nilesh Madhu, Wouter Tirry, Tim Fingscheidt: Instantaneous A Priori SNR Estimation by Cepstral Excitation Manipulation. 1592-1605
- Paavo Alku, Rahim Saeidi: The Linear Predictive Modeling of Speech From Higher-Lag Autocorrelation Coefficients Applied to Noise-Robust Speaker Recognition. 1606-1617
- Cheng Pang, Hong Liu, Jie Zhang, Xiaofei Li: Binaural Sound Localization Based on Reverberation Weighting and Generalized Parametric Mapping. 1618-1632
- Somanath Pradhan, Vinal Patel, Dipen Somani, Nithin V. George: An Improved Proportionate Delayless Multiband-Structured Subband Adaptive Feedback Canceller for Digital Hearing Aids. 1633-1643
- Szymon Drgas, Tuomas Virtanen, Jörg Lücke, Antti Hurmalainen: Binary Non-Negative Matrix Deconvolution for Audio Dictionary Learning. 1644-1656
- Fatemeh Saki, Nasser Kehtarnavaz: Real-Time Unsupervised Classification of Environmental Noise Signals. 1657-1667
- Lakshmish Kaushik, Abhijeet Sangwan, John H. L. Hansen: Automatic Sentiment Detection in Naturalistic Audio. 1668-1679
- Ofer Schwartz, Sharon Gannot, Emanuël A. P. Habets: Cramér-Rao Bound Analysis of Reverberation Level Estimators for Dereverberation and Noise Reduction. 1680-1693
- Seyran Khademi, Richard C. Hendriks, W. Bastiaan Kleijn: Intelligibility Enhancement Based on Mutual Information. 1694-1708
- Yuta Hatano, Chuang Shi, Yoshinobu Kajikawa: Compensation for Nonlinear Distortion of the Frequency Modulation-Based Parametric Array Loudspeaker. 1709-1717
- Yu-Ren Chien, Daryush D. Mehta, Jón Guðnason, Matías Zanartu, Thomas F. Quatieri: Evaluation of Glottal Inverse Filtering Algorithms Using a Physiologically Based Articulatory Speech Synthesizer. 1718-1730
Volume 25, Number 9, September 2017
- Jakob Abeßer, Gerald Schuller: Instrument-Centered Music Transcription of Solo Bass Guitar Recordings. 1741-1750
- Thomas Le Cornu, Ben Milner: Generating Intelligible Audio Speech From Visual Speech. 1751-1761
- Lemao Liu, Atsushi Fujita, Masao Utiyama, Andrew M. Finch, Eiichiro Sumita: Translation Quality Estimation Using Only Bilingual Corpora. 1762-1772
- Emad M. Grais, Gerard Roma, Andrew J. R. Simpson, Mark D. Plumbley: Two-Stage Single-Channel Audio Source Separation Using Deep Neural Networks. 1773-1783
- Giuliano Bernardi, Toon van Waterschoot, Jan Wouters, Marc Moonen: Adaptive Feedback Cancellation Using a Partitioned-Block Frequency-Domain Kalman Filter Approach With PEM-Based Signal Prewhitening. 1784-1798
- Vinal Patel, Jordan Cheer, Nithin V. George: Modified Phase-Scheduled-Command FxLMS Algorithm for Active Sound Profiling. 1799-1808
- Killian Janod, Mohamed Morchid, Richard Dufour, Georges Linarès, Renato De Mori: Denoised Bottleneck Features From Deep Autoencoders for Telephone Conversation Analysis. 1809-1820
- Nikolaos Stefanakis, Despoina Pavlidi, Athanasios Mouchtaris: Perpendicular Cross-Spectra Fusion for Sound Source Localization With a Planar Microphone Array. 1821-1835
- Takenori Yoshimura, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda: Simultaneous Optimization of Multiple Tree-Based Factor Analyzed HMM for Speech Synthesis. 1836-1845
- Eita Nakamura, Kazuyoshi Yoshii, Simon Dixon: Note Value Recognition for Piano Transcription Using Markov Random Fields. 1846-1858
Volume 25, Number 10, October 2017
- Xiaohai Tian, Siu Wa Lee, Zhizheng Wu, Eng Siong Chng, Haizhou Li: An Exemplar-Based Approach to Frequency Warping for Voice Conversion. 1863-1876
- Siying Wang, Sebastian Ewert, Simon Dixon: Identifying Missing and Extra Notes in Piano Recordings Using Score-Informed Dictionary Learning. 1877-1889
- Sandro Cumani, Pietro Laface: Joint Estimation of PLDA and Nonlinear Transformations of Speaker Vectors. 1890-1900
- Morten Kolbaek, Dong Yu, Zheng-Hua Tan, Jesper Jensen: Multitalker Speech Separation With Utterance-Level Permutation Invariant Training of Deep Recurrent Neural Networks. 1901-1913
- Cheng-Tao Chung, Cheng-Yu Tsai, Chia-Hsiang Liu, Lin-Shan Lee: Unsupervised Iterative Deep Learning of Speech Features and Acoustic Tokens with Applications to Spoken Term Detection. 1914-1928
- Niccolò Antonello, Enzo De Sena, Marc Moonen, Patrick A. Naylor, Toon van Waterschoot: Room Impulse Response Interpolation Using a Sparse Spatio-Temporal Representation of the Sound Field. 1929-1941
- Yanmin Qian, Nanxin Chen, Heinrich Dinkel, Zhizheng Wu: Deep Feature Engineering for Noise Robust Spoofing Detection. 1942-1955
- Sina Hafezi, Alastair H. Moore, Patrick A. Naylor: Augmented Intensity Vectors for Direction of Arrival Estimation in the Spherical Harmonic Domain. 1956-1968
- Byeongho Jo, Jung-Woo Choi: Spherical Harmonic Smoothing for Localizing Coherent Sound Sources. 1969-1984
- Emma Jokinen, Ulpu Remes, Paavo Alku: Intelligibility Enhancement of Telephone Speech Using Gaussian Process Regression for Normal-to-Lombard Spectral Tilt Conversion. 1985-1996
- Xiaofei Li, Laurent Girin, Radu Horaud, Sharon Gannot: Multiple-Speaker Localization Based on Direct-Path Features and Likelihood Maximization With Spatial Sparsity Regularization. 1997-2012
- Marc Arnela, Oriol Guasch: Finite Element Synthesis of Diphthongs Using Tuned Two-Dimensional Vocal Tracts. 2013-2023
- Deepak Baby, Hugo Van hamme: Joint Denoising and Dereverberation Using Exemplar-Based Sparse Representations and Decaying Norm Constraint. 2024-2035
Volume 25, Number 11, November 2017
- Qinghua Huang, Lin Zhang, Yong Fang: Two-Stage Decoupled DOA Estimation Based on Real Spherical Harmonics for Spherical Arrays. 2045-2058
- Tomoki Hayashi, Shinji Watanabe, Tomoki Toda, Takaaki Hori, Jonathan Le Roux, Kazuya Takeda: Duration-Controlled LSTM for Polyphonic Sound Event Detection. 2059-2070
- Monisankha Pal, Goutam Saha: Spectral Mapping Using Prior Re-Estimation of i-Vectors and System Fusion for Voice Conversion. 2071-2084
- Seppo Enarvi, Peter Smit, Sami Virpioja, Mikko Kurimo: Automatic Speech Recognition With Very Large Conversational Finnish and Estonian Vocabularies. 2085-2097
- Hannah Muckenhirn, Pavel Korshunov, Mathew Magimai-Doss, Sébastien Marcel: Long-Term Spectral Statistics for Voice Presentation Attack Detection. 2098-2111
- Brian Hamilton, Stefan Bilbao: FDTD Methods for 3-D Room Acoustics Simulation With High-Order Accuracy in Space and Time. 2112-2124
- Pejman Mowlaee, Martin Blass, W. Bastiaan Kleijn: New Results in Modulation-Domain Single-Channel Speech Enhancement. 2125-2137
- Dylan Menzies, Filippo Maria Fazi: Decoding and Compression of Channel and Scene Objects for Spatial Audio. 2138-2151
- Eunwoo Song, Frank K. Soong, Hong-Goo Kang: Effective Spectral and Excitation Modeling Techniques for LSTM-RNN-Based Speech Synthesis Systems. 2152-2161
- Pulkit Sharma, Vinayak Abrol, Anil Kumar Sao: Deep-Sparse-Representation-Based Features for Speech Recognition. 2162-2175
- Iynkaran Natgunanathan, Yong Xiang, Guang Hua, Gleb Beliakov, John Yearwood: Patchwork-Based Multilayer Audio Watermarking. 2176-2187
- Chengzhu Yu, John H. L. Hansen: Active Learning Based Constrained Clustering For Speaker Diarization. 2188-2198
- Emil Solsbæk Ottosen, Monika Dörfler: A Phase Vocoder Based on Nonstationary Gabor Frames. 2199-2208
- Boaz Schwartz, Sharon Gannot, Emanuël A. P. Habets: Two Model-Based EM Algorithms for Blind Source Separation in Noisy Environments. 2209-2222
- Maja Taseska, Emanuël A. P. Habets: Nonstationary Noise PSD Matrix Estimation for Multichannel Blind Speech Extraction. 2223-2236
- Bruno Di Giorgi, Simon Dixon, Massimiliano Zanoni, Augusto Sarti: A Data-Driven Model of Tonal Chord Sequence Complexity. 2237-2250
- Nikolaos Stefanakis, Despoina Pavlidi, Athanasios Mouchtaris: Corrections to "Perpendicular Cross-Spectra Fusion for Sound Source Localization With a Planar Microphone Array". 2251
Volume 25, Number 12, December 2017
- Tanja Schultz, Thomas Hueber, Dean J. Krusienski, Jonathan S. Brumberg: Introduction to the Special Issue on Biosignal-Based Spoken Communication. 2254-2256
- Tanja Schultz, Michael Wand, Thomas Hueber, Dean J. Krusienski, Christian Herff, Jonathan S. Brumberg: Biosignal-Based Spoken Communication: A Survey. 2257-2271
- Christopher Dromey, Katherine M. Black: Effects of Laryngeal Activity on Articulation. 2272-2280
- Michal Borsky, Daryush D. Mehta, Jarrad H. Van Stan, Jón Guðnason: Modal and Nonmodal Voice Quality Classification Using Acoustic and Electroglottographic Features. 2281-2291
- Alborz Rezazadeh Sereshkeh, Robert E. Trott, Aurélien Bricout, Tom Chau: EEG Classification of Covert Speech Using Regularized Neural Networks. 2292-2300
- Reza Sahraeian, Dirk Van Compernolle: Crosslingual and Multilingual Speech Recognition Based on the Speech Manifold. 2301-2312
- Dorde T. Grozdic, Slobodan T. Jovicic: Whispered Speech Recognition Using Deep Denoising Autoencoder and Inverse Filtering. 2313-2322
- Myung Jong Kim, Beiming Cao, Ted Mau, Jun Wang: Speaker-Independent Silent Speech Recognition From Flesh-Point Articulatory Movements Using an LSTM Neural Network. 2323-2336
- Patrick Lumban Tobing, Kazuhiro Kobayashi, Tomoki Toda: Articulatory Controllable Speech Modification Based on Statistical Inversion and Production Mappings. 2337-2350
- Ingmar Steiner, Sébastien Le Maguer, Alexander Hewer: Synthesis of Tongue Motion and Acoustics From Text Using a Multimodal Articulatory Database. 2351-2361
- José A. González, Lam Aun Cheah, Angel M. Gomez, Phil D. Green, James M. Gilbert, Stephen R. Ell, Roger K. Moore, Ed Holdsworth: Direct Speech Reconstruction From Articulatory Sensor Data by Machine Learning. 2362-2374
- Matthias Janke, Lorenz Diener: EMG-to-Speech: Direct Generation of Speech From Facial Electromyographic Signals. 2375-2385
- Geoffrey S. Meltzner, James T. Heaton, Yunbin Deng, Gianluca De Luca, Serge H. Roy, Joshua C. Kline: Silent Speech Recognition as an Alternative Communication Device for Persons With Laryngectomy. 2386-2398
- Fei Chen, Lan Wang, Hui Chen, Gang Peng: Investigations on Mandarin Aspiratory Animations Using an Airflow Model. 2399-2409
- Wayne Xiong, Jasha Droppo, Xuedong Huang, Frank Seide, Michael L. Seltzer, Andreas Stolcke, Dong Yu, Geoffrey Zweig: Toward Human Parity in Conversational Speech Recognition. 2410-2423
- Biao Zhang, Deyi Xiong, Jinsong Su, Hong Duan: A Context-Aware Recurrent Encoder for Neural Machine Translation. 2424-2432
- Afsaneh Asaei, Milos Cernak, Hervé Bourlard: Perceptual Information Loss due to Impaired Speech Production. 2433-2443
- Ning Ma, Tobias May, Guy J. Brown: Exploiting Deep Neural Networks and Head Movements for Robust Binaural Localization of Multiple Sources in Reverberant Environments. 2444-2453