Bird Call Classification Using DNN-Based Acoustic Modelling

Cited by: 4

Authors:

Rajan, Rajeev [1,2]

Johnson, Jisna [1,2]

Kareem, Noumida Abdul [1,2]

Affiliations:

[1] Coll Engn, Dept Elect & Commun Engn, Thiruvananthapuram, Kerala, India

[2] APJ Abdul Kalam Technol Univ, Thiruvananthapuram, Kerala, India

Source:

CIRCUITS SYSTEMS AND SIGNAL PROCESSING | 2022 / Vol. 41 / Issue 05

Keywords:

Hidden Markov model; Gaussian mixture model; Deep neural network; Convolutional neural network

DOI:

10.1007/s00034-021-01896-2

Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronic technology, communication technology]

Discipline classification code:

0808; 0809

Abstract:

Bird call recognition using deep neural network-hidden Markov model (DNN-HMM)-based transcription is proposed. The work attempts to adapt the human speech recognition framework to bird call classification through a transcription approach. Initially, phone transcriptions are generated using CMU-Sphinx, and the lexicons are modified using group delay-based segmentation. Bird call transcription is then implemented in a hybrid DNN-HMM framework through DNN-based acoustic modelling. During acoustic modelling, mel-frequency cepstral coefficient (MFCC) features are computed and evaluated with monophone models and triphone models, followed by linear discriminant analysis (LDA) and maximum likelihood linear transform (MLLT). In the final phase, the transcribed phonemes are corrected using context-based rules. The proposed approach is evaluated on a dataset of 563 audio tracks from ten species. The hybrid DNN-HMM approach achieves an accuracy of 94.46%, outperforming the convolutional neural network and long short-term memory frameworks.

Pages: 2669-2680

Page count: 12
