Robust audio fingerprinting based on GammaChirp frequency cepstral coefficients and chroma

Cited by: 3
Authors
Chen, N. [1]
Xiao, H. D. [2]
Zhu, J. [3]
Affiliations
[1] E China Univ Sci & Technol, Sch Informat Sci & Technol, Shanghai 200237, Peoples R China
[2] Chinese Acad Sci, Shanghai Adv Res Inst, Shanghai 201210, Peoples R China
[3] Shanghai Jiao Tong Univ, Dept Elect Engn, Shanghai 200240, Peoples R China
Funding
Natural Science Foundation of Shanghai; National Natural Science Foundation of China
DOI
10.1049/el.2013.3554
Chinese Library Classification
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification codes
0808; 0809
Abstract
A novel auditory feature that combines an auditory model with music theory is proposed for audio fingerprinting. First, the input audio is filtered by a GammaChirp (GC) filterbank to model cochlear frequency selectivity. Then, the filterbank output is downsampled and decorrelated by a discrete cosine transform to obtain the GammaChirp frequency cepstral coefficients (GCFCCs). Next, several of the lowest-order GCFCCs are projected onto the chroma to characterise both the melodic and harmonic information of the input. Finally, non-negative matrix factorisation is applied to the chroma matrix to reduce its dimension while maintaining its discriminative power. Experimental results show that the proposed scheme achieves a more stable identification rate and lower computational complexity than schemes based on Mel-frequency cepstral coefficients. © The Institution of Engineering and Technology 2014.
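The steps in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The filter parameters (`n`, `b`, `c`, `dur`), the frame and hop sizes, the use of log frame energies as the downsampling step, and all function names are assumptions chosen for illustration. The chroma projection is left out because the abstract does not say how it is done, so the NMF routine (standard multiplicative updates for the Frobenius objective) is shown on its own and would be applied to whatever non-negative chroma matrix that step produces.

```python
import numpy as np

def erb(fc):
    """Equivalent rectangular bandwidth in Hz (Glasberg-Moore approximation)."""
    return 24.7 + 0.108 * fc

def gammachirp_ir(fc, fs, n=4, b=1.019, c=-2.0, dur=0.025):
    """Unit-energy gammachirp impulse response.

    fc is the asymptotic carrier frequency; n, b, c and dur are assumed values.
    """
    t = np.arange(1, int(round(dur * fs)) + 1) / fs  # start at 1/fs so log(t) is finite
    g = t ** (n - 1) * np.exp(-2 * np.pi * b * erb(fc) * t)
    g = g * np.cos(2 * np.pi * fc * t + c * np.log(t))
    return g / np.sqrt(np.sum(g ** 2))

def gcfcc(x, fs, centre_freqs, frame=256, hop=128, n_coef=6):
    """Log frame energies per channel, decorrelated across channels by a DCT-II."""
    n_frames = 1 + (len(x) - frame) // hop
    energies = np.empty((len(centre_freqs), n_frames))
    for ch, fc in enumerate(centre_freqs):
        y = np.convolve(x, gammachirp_ir(fc, fs), mode="same")
        for i in range(n_frames):
            seg = y[i * hop : i * hop + frame]
            energies[ch, i] = np.sum(seg ** 2)  # framing acts as the downsampling step
    log_e = np.log(energies + 1e-10)
    k = len(centre_freqs)
    # Unnormalised DCT-II basis; row m holds cos(pi * m * (j + 0.5) / k)
    basis = np.cos(np.pi * np.outer(np.arange(n_coef), np.arange(k) + 0.5) / k)
    return basis @ log_e  # shape: (n_coef, n_frames)

def nmf(V, r, n_iter=200, seed=0):
    """Factorise non-negative V (m x n) as W (m x r) @ H (r x n)."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], r)) + 1e-3
    H = rng.random((r, V.shape[1])) + 1e-3
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + 1e-12)
        W *= (V @ H.T) / (W @ H @ H.T + 1e-12)
    return W, H

if __name__ == "__main__":
    fs = 8000
    t = np.arange(4000) / fs
    x = np.sin(2 * np.pi * 440 * t)                # half a second of a 440 Hz tone
    freqs = [200, 400, 800, 1600, 3200]            # hypothetical centre frequencies
    print(gcfcc(x, fs, freqs, n_coef=3).shape)
    chroma_like = np.random.default_rng(1).random((12, 30))  # stand-in chroma matrix
    W, H = nmf(chroma_like, r=4)
    print(W.shape, H.shape)
```

In this sketch the columns of `H` are the reduced-dimension representation of each frame, which is the role the abstract assigns to the factorisation.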
Pages: 241 / U174
Page count: 2