Review on Main Melody Extraction from Pop Music

doi:10.11896/j.issn.1002-137X.2017.05.001

Abstract

Abstract: Melody is one of the most important elements of music, with applications in music content analysis, music creation, music education, and plagiarism detection. Main melody extraction aims to automatically estimate, from a music recording, the pitch or fundamental frequency of the monophonic note sequence that forms the main melody. Pop music is generally complex polyphonic music, so main melody extraction faces many challenges. This paper reviews the research background of main melody extraction and describes typical methods in three categories: methods based on pitch salience, methods based on source separation, and other methods. Finally, it introduces the evaluation metrics for main melody extraction and the research progress in the field.

Key words: Main melody extraction, Polyphonic music, Music information retrieval

LI Wei, FENG Xiang-yi, WU Yi-ming, ZHANG Xu-long. Review on Main Melody Extraction from Pop Music[J]. Computer Science, 2017, 44(5): 1-5. https://doi.org/10.11896/j.issn.1002-137X.2017.05.001

