Bird Call Classification Using DNN-Based Acoustic Modelling

Cited by: 4
Authors
Rajan, Rajeev [1, 2]
Johnson, Jisna [1, 2]
Kareem, Noumida Abdul [1, 2]
Affiliations
[1] Coll Engn, Dept Elect & Commun Engn, Thiruvananthapuram, Kerala, India
[2] APJ Abdul Kalam Technol Univ, Thiruvananthapuram, Kerala, India
Keywords
Hidden Markov model; Gaussian mixture model; Deep neural network; Convolutional neural network;
DOI
10.1007/s00034-021-01896-2
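Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification code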
0808 ; 0809 ;
Abstract
Bird call recognition using deep neural network-hidden Markov model (DNN-HMM)-based transcription is proposed. The work adapts the human speech recognition framework to bird call classification through a transcription approach. Initially, phone transcriptions are generated using CMU-Sphinx, and the lexicons are modified using group delay-based segmentation. Bird call transcription is then implemented in a hybrid DNN-HMM framework through DNN-based acoustic modelling. For the acoustic modelling, mel-frequency cepstral coefficients (MFCCs) are computed and evaluated with monophone models and triphone models, followed by linear discriminant analysis (LDA) and maximum likelihood linear transform (MLLT). In the final phase, the transcribed phonemes are corrected using context-based rules. The proposed approach is evaluated on a dataset of 563 audio tracks covering ten species. The hybrid DNN-HMM approach achieves an accuracy of 94.46%, outperforming the convolutional neural network and long short-term memory frameworks.
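The abstract names MFCCs as the acoustic features but gives no extraction settings. The block below is a minimal sketch of the usual MFCC recipe (windowing, power spectrum, mel filterbank, log, DCT-II) using only the Python standard library. It is not the authors' implementation: the function names, the frame length, hop size, filter count, and number of coefficients are all illustrative assumptions, and the naive DFT stands in for the FFT a real toolkit would use.

```python
import cmath
import math


def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale (one row per filter)."""
    n_bins = n_fft // 2 + 1
    top = hz_to_mel(sample_rate / 2.0)
    # n_filters + 2 edge frequencies, converted to DFT bin indices
    edges = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * f / sample_rate)) for f in edges]
    bank = [[0.0] * n_bins for _ in range(n_filters)]
    for m in range(1, n_filters + 1):
        left, centre, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, centre):      # rising slope
            bank[m - 1][k] = (k - left) / (centre - left)
        for k in range(centre, right):     # falling slope
            bank[m - 1][k] = (right - k) / (right - centre)
    return bank


def power_spectrum(frame, n_fft):
    """Naive DFT power spectrum; adequate for a sketch, slow for real audio."""
    spec = []
    for k in range(n_fft // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n_fft)
                  for t, x in enumerate(frame))
        spec.append(abs(acc) ** 2 / n_fft)
    return spec


def mfcc(signal, sample_rate, frame_len=64, hop=32, n_filters=10, n_ceps=5):
    """Return one list of n_ceps cepstral coefficients per frame.

    All default values are hypothetical, chosen to keep the example small.
    """
    bank = mel_filterbank(n_filters, frame_len, sample_rate)
    window = [0.54 - 0.46 * math.cos(2 * math.pi * t / (frame_len - 1))
              for t in range(frame_len)]  # Hamming window
    features = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + t] * window[t] for t in range(frame_len)]
        power = power_spectrum(frame, frame_len)
        # Log filterbank energies; the small constant avoids log(0)
        log_e = [math.log(sum(w * p for w, p in zip(row, power)) + 1e-10)
                 for row in bank]
        # DCT-II decorrelates the log energies into cepstral coefficients
        ceps = [sum(log_e[m] * math.cos(math.pi * c * (m + 0.5) / n_filters)
                    for m in range(n_filters))
                for c in range(n_ceps)]
        features.append(ceps)
    return features


# Synthetic 1 kHz tone as a stand-in for a bird call recording
sr = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sr) for t in range(256)]
feats = mfcc(tone, sr)
print(len(feats), len(feats[0]))
```

In a speech-style pipeline such as the one the abstract describes, per-frame vectors like these would feed the monophone and triphone models; larger frames and more coefficients than the toy values here would normally be used.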
Pages: 2669-2680
Number of pages: 12
Related papers
50 in total
  • [31] Towards breaking DNN-based audio steganalysis with GAN
    Wang, Jie
    Wang, Rangding
    Dong, Li
    Yan, Diqun
    Zhang, Xueyuan
    Lin, Yuzhen
    INTERNATIONAL JOURNAL OF AUTONOMOUS AND ADAPTIVE COMMUNICATIONS SYSTEMS, 2021, 14 (04) : 371 - 383
  • [32] A study of speaker adaptation for DNN-based speech synthesis
    Wu, Zhizheng
    Swietojanski, Pawel
    Veaux, Christophe
    Renals, Steve
    King, Simon
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 879 - 883
  • [33] Towards More Efficient DNN-Based Speech Enhancement Using Quantized Correlation Mask
    Abdullah, Salinna
    Zamani, Majid
    Demosthenous, Andreas
    IEEE ACCESS, 2021, 9 : 24350 - 24362
  • [34] Development of DNN-based LIB State Diagnosis System Using Statistical Feature Extraction
    Seo, Donghoon
    Shin, Jongho
    Journal of Institute of Control, Robotics and Systems, 2024, 30 (07) : 755 - 762
  • [35] DNN-Based Linear Prediction Residual Enhancement for Speech Dereverberation
    Feng, Xinyang
    Li, Nuo
    He, Zunwen
    Zhang, Yan
    Zhang, Wancheng
    2021 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2021, : 541 - 545
  • [36] Investigation of DNN-Based Audio-Visual Speech Recognition
    Tamura, Satoshi
    Ninomiya, Hiroshi
    Kitaoka, Norihide
    Osuga, Shin
    Iribe, Yurie
    Takeda, Kazuya
    Hayamizu, Satoru
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2016, E99D (10) : 2444 - 2451
  • [37] DNN-Based Estimation for Misalignment State of Automotive Radar Sensor
    Kim, Junho
    Jeong, Taewon
    Lee, Seongwook
    SENSORS, 2023, 23 (14)
  • [38] MalPatch: Evading DNN-Based Malware Detection With Adversarial Patches
    Zhan, Dazhi
    Duan, Yexin
    Hu, Yue
    Li, Weili
    Guo, Shize
    Pan, Zhisong
    IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, 2024, 19 : 1183 - 1198
  • [39] DNN-Based Speech Enhancement via Integrating NMF and CASA
    Yan, Bofang
    Bao, Changchun
    Bai, Zhigang
    2018 INTERNATIONAL CONFERENCE ON AUDIO, LANGUAGE AND IMAGE PROCESSING (ICALIP), 2018, : 435 - 439
  • [40] DNN-Based Supervised Spontaneous Court Hearing Transcription for Amharic
    Tachbelie, Martha Yifiru
    Abate, Solomon Teferra
    Aga, Rosa Tsegaye
    Mekonnen, Rahel
    Mulugeta, Hiwot
    Mulat, Abel
    Mulat, Ashenafi
    Merkebu, Solomon
    Debelee, Taye Girma
    Gachena, Worku
    PAN-AFRICAN CONFERENCE ON ARTIFICIAL INTELLIGENCE, PT I, PANAFRICON AI 2023, 2024, 2068 : 237 - 249