Bird Call Classification Using DNN-Based Acoustic Modelling

Cited by: 3

Authors:

Rajan, Rajeev [1,2]

Johnson, Jisna [1,2]

Kareem, Noumida Abdul [1,2]

Affiliations:

[1] Coll Engn, Dept Elect & Commun Engn, Thiruvananthapuram, Kerala, India

[2] APJ Abdul Kalam Technol Univ, Thiruvananthapuram, Kerala, India

Source:

CIRCUITS SYSTEMS AND SIGNAL PROCESSING | 2022 / Vol. 41 / Issue 05

Keywords:

Hidden Markov model; Gaussian mixture model; Deep neural network; Convolutional neural network

DOI:

10.1007/s00034-021-01896-2

Chinese Library Classification:

TM [Electrical Engineering]; TN [Electronics and Communication Technology]

Discipline classification codes:

0808; 0809

Abstract:

Bird call recognition using deep neural network-hidden Markov model (DNN-HMM)-based transcription is proposed. The work attempts to adapt the human speech recognition framework to bird call classification through a transcription approach. Initially, phone transcriptions are generated using CMU-Sphinx, and the lexicons are modified using group delay-based segmentation. Bird call transcription is then implemented in a hybrid DNN-HMM framework through DNN-based acoustic modelling. During acoustic modelling, mel-frequency cepstral coefficient (MFCC) features are computed and evaluated with monophone models and triphone models, followed by linear discriminant analysis and maximum likelihood linear transform. In the final phase, the transcribed phonemes are corrected using context-based rules. The proposed approach is evaluated on a dataset of 563 audio tracks covering ten species. The hybrid DNN-HMM approach achieves an accuracy of 94.46%, outperforming the convolutional neural network and long short-term memory frameworks.


Pages: 2669-2680

Page count: 12

