Noise-robust Hidden Markov Models for limited training data for within-species bird phrase classification

Cited by: 0

Authors:

Kaewtip, Kantapon [1]

Taylor, Charles [2]

Alwan, Abeer [1]

Affiliations:

[1] Univ Calif Los Angeles, Dept Elect Engn, Los Angeles, CA 90024 USA

[2] Univ Calif Los Angeles, Dept Ecol & Evolutionary Biol, Los Angeles, CA USA

Source:

17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES | 2016

Funding:

National Science Foundation (USA);

Keywords:

Hidden Markov Models (HMMs); limited data; noise-robust; bird phrase classification; speech recognition; continuous recordings;

DOI:

10.21437/Interspeech.2016-1360

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Subject classification codes:

070206 ; 082403 ;

Abstract:

Hidden Markov Models (HMMs) have been studied and used extensively in speech and birdsong recognition, but they are not robust to limited training data or to noise. This paper presents two novel approaches to training continuous and discrete HMMs with extremely limited data. First, the algorithm learns global Gaussian Mixture Models (GMMs) from all available training phrases. The GMM parameters are then used to initialize the state parameters of each individual model. For the GMM-HMM framework, the number of states and the number of mixture components per state are determined by the acoustic variation of each phrase type. The prominent (high-energy) time-frequency regions are used to compute the state emission probabilities, which increases noise robustness. For the discrete HMM framework, the probability distribution of each state is initialized from the global GMMs in training. In testing, the probability of each codebook entry is estimated using the prominent regions of each state, again to increase noise robustness. In Cassin's Vireo phrase classification with 75 phrase types, the new GMM-HMM approach achieves 79.5% and 87% classification accuracy using 1 and 2 phrases, respectively, while HTK's GMM-HMM framework falls back to guessing, yielding 1.33% accuracy (chance level for 75 classes). The performance of the other algorithm is presented in the paper.
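
The abstract gives only the outline of the method, so the following is a minimal NumPy sketch of the three ideas it names, not the authors' implementation: a global GMM fitted to pooled training frames, HMM state parameters initialized from that global GMM, and emission log-likelihoods restricted to prominent (high-energy) time-frequency bins. Everything specific here is an assumption made for illustration: the function names, diagonal covariances, the uniform split of a phrase into states, picking one global component per state, and defining "prominent" as the top fraction of bins by energy in each frame.

```python
import numpy as np

def component_log_density(frames, means, variances, mask=None):
    """Log N(x | mean_k, diag(var_k)) for every frame and component.

    If a boolean mask of shape (n_frames, n_dims) is given, only the
    selected dimensions of each frame contribute to the sum.
    """
    diff = frames[:, None, :] - means[None, :, :]            # (n, k, d)
    per_dim = -0.5 * (np.log(2 * np.pi * variances)[None] + diff**2 / variances[None])
    if mask is not None:
        per_dim = per_dim * mask[:, None, :]
    return per_dim.sum(axis=2)                               # (n, k)

def fit_global_gmm(frames, n_components=4, n_iter=20, seed=0):
    """Fit a diagonal-covariance GMM to pooled frames with plain EM."""
    rng = np.random.default_rng(seed)
    n, _ = frames.shape
    means = frames[rng.choice(n, n_components, replace=False)].copy()
    variances = np.tile(frames.var(axis=0) + 1e-3, (n_components, 1))
    weights = np.full(n_components, 1.0 / n_components)
    for _ in range(n_iter):
        # E-step: responsibilities of each component for each frame.
        log_p = component_log_density(frames, means, variances) + np.log(weights)
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means and (floored) variances.
        nk = resp.sum(axis=0) + 1e-8
        weights = nk / nk.sum()
        means = (resp.T @ frames) / nk[:, None]
        variances = np.maximum((resp.T @ frames**2) / nk[:, None] - means**2, 1e-3)
    return weights, means, variances

def init_states_from_global(phrase, gmm, n_states=3):
    """Assumed initialization: split the phrase uniformly into states and
    copy, for each state, the best-matching global component."""
    _, means, variances = gmm
    picks = [
        int(np.argmax(component_log_density(seg, means, variances).sum(axis=0)))
        for seg in np.array_split(phrase, n_states)
    ]
    return means[picks].copy(), variances[picks].copy()

def prominent_mask(spec, keep=0.25):
    """Assumed definition of 'prominent': the top `keep` fraction of
    bins by energy within each frame."""
    thresh = np.quantile(spec, 1.0 - keep, axis=1, keepdims=True)
    return spec >= thresh
```

With one or two examples per phrase type there is too little data to estimate per-state Gaussians from scratch, which is why borrowing parameters from a model trained on all phrases is attractive. In this sketch, passing `mask=prominent_mask(phrase)` to `component_log_density` gives state emission scores that ignore low-energy bins, where background noise would otherwise dominate.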


Pages: 2587-2591

Page count: 5

