Blind source separation-based IVA-Xception model for bird sound recognition in complex acoustic environments

Cited by: 13

Authors:

Dai, Yusheng ^{[1]}

Yang, Jin ^{[1]}

Dong, Yiwei ^{[2]}

Zou, Haipeng ^{[3]}

Hu, Mingzhi ^{[1]}

Wang, Bin ^{[4]}

Affiliations:

[1] Sichuan Univ, Sch Cyber Sci & Engn, 24 South Sect 1,Yihuan Rd, Chengdu 610065, Peoples R China

[2] Sichuan Univ, Coll Math, Chengdu, Peoples R China

[3] Sichuan Univ, Coll Software Engn, Chengdu, Peoples R China

[4] Southwest Jiaotong Univ, Sch Informat Sci & Technol, Chengdu, Peoples R China

Source:

ELECTRONICS LETTERS | 2021 / Vol. 57 / Issue 11

Funding:

National Natural Science Foundation of China;

Keywords:

Biology and medical computing; Computer vision and image processing techniques; Digital signal processing; Other topics in statistics; Signal processing and detection; Speech and audio signal processing;

DOI:

10.1049/ell2.12160

Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Discipline classification code:

0808 ; 0809 ;

Abstract:

Identifying bird species from audio recordings is a major area of interest in ecological surveillance and biodiversity conservation. Previous studies have successfully identified bird species from given recordings. However, most of these studies apply only to low-noise acoustic environments and to recordings in which only one bird vocalizes at a time. In reality, bird audio recorded in the wild often contains overlapping signals, such as the bird dawn chorus, which makes audio feature extraction and accurate classification extremely difficult. This study is the first to focus on applying a blind source separation method to identify all foreground bird species contained in overlapping vocalization recordings. The proposed IVA-Xception model is based on independent vector analysis and a convolutional neural network. Experiments on the 2020 Bird Sound Recognition in Complex Acoustic Environments competition (BirdCLEF2020) dataset show that this model can achieve a higher macro F1-score and average accuracy than state-of-the-art methods.

Pages: 454 - 456

Page count: 3

10 records in total

[1] Automatic Bird Sound Source Separation Based on Passive Acoustic Devices in Wild Environment
Xie, Jiangjian
Shi, Yuwei
Ni, Dongming
Milling, Manuel
Liu, Shuo
Zhang, Junguo
Qian, Kun
Schuller, Bjorn W.
IEEE INTERNET OF THINGS JOURNAL, 2024, 11 (09) : 16604 - 16617
[2] Direction of Arrival Based Spatial Covariance Model for Blind Sound Source Separation
Nikunen, Joonas
Virtanen, Tuomas
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2014, 22 (03) : 727 - 739
[3] Spiny Lobster Sound Identification Based on Blind Source Separation (BSS) for Passive Acoustic Monitoring (PAM)
Hadi, Fatin Izzati M. A.
Ramli, Dzati Athiar
Hassan, Norsalina
KNOWLEDGE-BASED AND INTELLIGENT INFORMATION & ENGINEERING SYSTEMS (KSE 2021), 2021, 192 : 4493 - 4502
[4] Two Model-Based EM Algorithms for Blind Source Separation in Noisy Environments
Schwartz, Boaz
Gannot, Sharon
Habets, Emanuel A. P.
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2017, 25 (11) : 2209 - 2222
[5] Blind Source Separation of Acoustic Signals in Realistic Environments Based on ICA in the Time-Frequency Domain
Ding, Shuxue
Cichocki, Andrzej
Huang, Jie
Wei, Daming
INTERNATIONAL JOURNAL OF PERVASIVE COMPUTING AND COMMUNICATIONS, 2005, 1 (02) : 89 - 100
[6] Online Adaptation of Fourier Series Based Acoustic Transfer Function Model to Improve Sound Source Localization and Separation
Sudo, Yui
Takigahira, Masayuki
Tsuru, Hideo
Nakadai, Kazuhiro
Nakajima, Hirofumi
2023 32ND IEEE INTERNATIONAL CONFERENCE ON ROBOT AND HUMAN INTERACTIVE COMMUNICATION, RO-MAN, 2023, : 2058 - 2063
[7] Binaural Sound Source Localization Based on Generalized Parametric Model and Two-Layer Matching Strategy in Complex Environments
Liu, Hong
Pang, Cheng
Zhang, Jie
2015 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2015, : 4496 - 4503
[8] Online adaptation of fourier series-based acoustic transfer function model and its application to sound source localization and separation
Sudo, Yui
Takigahira, Masayuki
Tsuru, Hideo
Nakadai, Kazuhiro
Nakajima, Hirofumi
ADVANCED ROBOTICS, 2024, 38 (19-20) : 1351 - 1363
[9] Acoustic Model Combination Incorporated With Mask-Based Multi-Channel Source Separation for Automatic Speech Recognition
Yoon, Jae Sam
Park, Ji Hun
Kim, Hong Kook
Kim, Hoirin
IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING, 2010, 4 (05) : 772 - 784
[10] Noise-robust hands-free speech recognition using SIMO-model-based blind source separation
Mori, Y.
Takatani, T.
Saruwatari, H.
Shikano, K.
Hiekata, T.
Morita, T.
2007 9TH INTERNATIONAL SYMPOSIUM ON SIGNAL PROCESSING AND ITS APPLICATIONS, VOLS 1-3, 2007, : 1290 - +
