Classification of Vocal and Non-vocal Regions from Audio Songs using Spectral Features and Pitch Variations

Citations: 0
Authors
Murthy, Y. V. Srinivasa [1]
Koolagudi, Shashidhar G. [1]
Affiliations
[1] Natl Inst Technol Karnataka, Dept CSE, Surathkal 575025, India
Source
2015 IEEE 28TH CANADIAN CONFERENCE ON ELECTRICAL AND COMPUTER ENGINEERING (CCECE) | 2015
Keywords
Mel-frequency cepstral coefficients; Pitch; Jitter; Shimmer; Artificial neural networks; Vocal regions; Non-vocal regions; Music
DOI
Not available
Chinese Library Classification (CLC) number
TP301 [Theory and methods]
Discipline classification code
081202
Abstract
In this work, an effort is made to identify vocal and non-vocal regions in a given song using signal processing techniques and a machine learning algorithm. Initially, spectral features such as mel-frequency cepstral coefficients (MFCCs) are used to develop the baseline system. Statistical values of pitch, jitter, and shimmer are then considered to improve the performance of the system. Artificial neural networks (ANNs) are used to capture the characteristics of the vocal and non-vocal segments of the songs. The experiment is conducted on 60 vocal and 60 non-vocal clips extracted from Telugu albums. An 11-point moving window is used to ensure the continuity of vocal and non-vocal segments, thus improving the accuracy of the system. With this approach, the system achieves 85.59% accuracy for vocal and 88.52% for non-vocal segment classification.
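The 11-point moving window mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the window is applied, so majority voting over per-frame labels is an assumption here, as are the function name `smooth_labels`, the 1 = vocal / 0 = non-vocal encoding, and the handling of the sequence edges.

```python
from typing import List


def smooth_labels(labels: List[int], window: int = 11) -> List[int]:
    """Majority-vote smoothing of binary frame labels.

    Assumed encoding: 1 = vocal, 0 = non-vocal. The window is centred on
    each frame and truncated at the start and end of the sequence.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    smoothed = []
    for i, original in enumerate(labels):
        lo = max(0, i - half)
        hi = min(len(labels), i + half + 1)
        segment = labels[lo:hi]
        vocal_votes = sum(segment)
        if 2 * vocal_votes > len(segment):
            smoothed.append(1)
        elif 2 * vocal_votes < len(segment):
            smoothed.append(0)
        else:
            # A tie can only occur in a truncated, even-length edge window;
            # keep the classifier's original decision in that case.
            smoothed.append(original)
    return smoothed
```

With this kind of smoothing, an isolated non-vocal frame inside a long vocal run is relabelled as vocal, which is one way a moving window can enforce the segment continuity the abstract describes.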
Pages: 1271-1276
Page count: 6
Related papers
  • [1] Classification of vocal and non-vocal segments in audio clips using genetic algorithm based feature selection (GAFS)
    Murthy, Y. V. Srinivasa
    Koolagudi, Shashidhar G.
    EXPERT SYSTEMS WITH APPLICATIONS, 2018, 106 : 77 - 91
  • [2] Contributions of Forward-Focused Voice to Audio-Vocal Feedback Measured Using Nasal Accelerometry and Power Spectral Analysis of Vocal Fundamental Frequency
    Lee, Shao-Hsuan
    Torng, Pao-Chuan
    Lee, Guo-She
    JOURNAL OF SPEECH LANGUAGE AND HEARING RESEARCH, 2022, 65 (05) : 1751 - 1766
  • [3] Dravidian language classification from speech signal using spectral and prosodic features
    Koolagudi S.G.
    Bharadwaj A.
    Srinivasa Murthy Y.V.
    Reddy N.
    Rao P.
    International Journal of Speech Technology, 2017, 20 (4) : 1005 - 1016
  • [4] Singer identification using perceptual features and cepstral coefficients of an audio signal from Indian video songs
    Ratanpara, Tushar
    Patel, Narendra
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2015