Melody extraction from music using modified group delay functions

Cited by: 4
Authors
Rajan R. [1]
Misra M. [2]
Murthy H.A. [1]
Affiliations
[1] Department of Computer Science and Engineering, Indian Institute of Technology, Madras, Chennai
[2] Center for Computer Research in Music and Acoustics, Stanford University, Stanford, CA
Keywords
Group delay; Modified group delay-source; Modified group delay-system; Pitch extraction for music
DOI
10.1007/s10772-017-9397-1
Abstract
This paper discusses modified group delay based algorithms for estimating melodic pitch sequences from heterophonic/polyphonic music. Two variants of the modified group delay function are proposed: (a) a system-based variant, MODGD (Direct), and (b) a source-based variant, MODGD (Source). In (a), the standard modified group delay function (MODGDF) is used to estimate the prominent melodic pitch (f0), which appears like a low-frequency formant in the MODGDF spectrum. In (b), the power spectrum of the signal is first flattened to emphasise the source. The flattened power spectrum behaves like a sinusoid in noise, with the frequency of the sinusoid related to the pitch frequency. The modified group delay function of this signal produces peaks at T0, 2T0, …, where T0 = 1/f0. Continuity constraints in a dynamic programming framework are imposed across frames to reduce octave errors. Sudden changes in pitch are accommodated by changing the frame size dynamically using a multi-resolution framework. The proposed systems were evaluated on four datasets: ADC-2004, LabROSA, MIREX-2008 and a Carnatic music dataset. The results demonstrate the potential of group delay based methods for melody extraction. © 2017, Springer Science+Business Media New York.
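The abstract does not give the cost function used for the continuity constraints, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each frame already supplies a few pitch candidates with salience scores (for example, peaks picked from a group delay spectrum), and it penalises frame-to-frame jumps by their size in octaves. The names `track_pitch` and `jump_penalty`, and the log2 distance penalty, are illustrative assumptions.

```python
import math


def track_pitch(candidates, jump_penalty=1.0):
    """Pick one pitch per frame with a Viterbi-style dynamic program.

    candidates: list of frames; each frame is a non-empty list of
                (f0_hz, salience) tuples. Hypothetical input format.
    jump_penalty: assumed weight on the pitch jump, measured in octaves.
    Returns the list of selected f0 values, one per frame.
    """
    # score[t][j]: best cumulative score of any path ending at candidate j
    score = [[sal for _, sal in candidates[0]]]
    back = []  # back[t-1][j]: best predecessor index for candidate j at frame t

    for t in range(1, len(candidates)):
        row, ptr = [], []
        for f0, sal in candidates[t]:
            best_i, best = 0, -math.inf
            for i, (f_prev, _) in enumerate(candidates[t - 1]):
                # Transition cost grows with the jump in octaves, so a
                # sudden doubling or halving of f0 costs about jump_penalty.
                value = score[-1][i] - jump_penalty * abs(math.log2(f0 / f_prev))
                if value > best:
                    best, best_i = value, i
            row.append(best + sal)
            ptr.append(best_i)
        score.append(row)
        back.append(ptr)

    # Backtrack from the best final candidate.
    j = max(range(len(score[-1])), key=score[-1].__getitem__)
    path = [j]
    for ptr in reversed(back):
        j = ptr[j]
        path.append(j)
    path.reverse()
    return [candidates[t][j][0] for t, j in enumerate(path)]


# Toy example: in the middle frame the octave-doubled candidate (444 Hz)
# has slightly higher salience, so a frame-by-frame maximum would pick it.
frames = [
    [(220.0, 1.0), (440.0, 0.4)],
    [(222.0, 0.8), (444.0, 0.9)],
    [(221.0, 1.0), (442.0, 0.3)],
]
print(track_pitch(frames))
```

In this toy input, choosing the most salient candidate in each frame independently would jump up an octave in the middle frame, whereas the continuity penalty keeps the path near 220 Hz. A larger `jump_penalty` enforces smoother contours at the cost of tracking genuine pitch leaps more slowly.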
Pages: 185-204
Page count: 19