Nonnegative features of spectro-temporal sounds for classification

Cited by: 47
Authors
Cho, YC [1]
Choi, SJ [1]
Affiliation
[1] Pohang Univ Sci & Technol, Dept Comp Sci, Pohang 790784, South Korea
Keywords
acoustic feature extraction; general sound recognition; nonnegative matrix factorization
DOI
10.1016/j.patrec.2004.11.026
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A parts-based representation is a way of understanding object recognition in the brain. Nonnegative matrix factorization (NMF) is an algorithm that is able to learn a parts-based representation by allowing only non-subtractive combinations [Lee, D.D., Seung, H.S., 1999. Learning the parts of objects by non-negative matrix factorization. Nature 401, 788-791]. In this paper we incorporate a parts-based representation of spectro-temporal sounds into acoustic feature extraction, which leads to nonnegative features. We present a method of inferring encoding variables in the framework of NMF and show that the method produces robust acoustic features in the presence of noise in the task of general sound classification. Experimental results confirm that the proposed feature extraction method improves the classification performance, especially in the presence of noise, compared to independent component analysis (ICA), which produces holistic features. (c) 2004 Elsevier B.V. All rights reserved.
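As a rough illustration of the NMF framework mentioned in the abstract, the sketch below factorizes a nonnegative matrix V into a basis W and encodings H, then infers encodings for new data with the basis held fixed. It uses the widely known Euclidean-cost multiplicative updates; this is an assumption made for illustration, and the inference method actually proposed in the paper may differ. The function names `nmf` and `infer_encoding` and the synthetic data are hypothetical.

```python
import numpy as np


def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factorize nonnegative V (n x m) as W (n x rank) @ H (rank x m).

    Uses Euclidean-cost multiplicative updates, which keep W and H
    nonnegative as long as they start nonnegative.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def infer_encoding(V_new, W, n_iter=200, eps=1e-9, seed=0):
    """Infer nonnegative encodings for new data with the basis W fixed.

    Illustrative assumption: only the H update is iterated.
    """
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V_new.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V_new) / (W.T @ W @ H + eps)
    return H


# Synthetic nonnegative data standing in for spectro-temporal patches.
data_rng = np.random.default_rng(1)
V = data_rng.random((20, 3)) @ data_rng.random((3, 50))

W, H = nmf(V, rank=3)
H_new = infer_encoding(V[:, :5], W)  # nonnegative features for 5 columns
```

In a classification setting, the columns of `H_new` would play the role of nonnegative feature vectors passed to a classifier; that last step is not shown here.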
Pages: 1327-1336
Page count: 10
Related papers
50 in total
  • [31] Phoneme Classification Using Temporal Tracking of Speech Clusters in Spectro-temporal Domain
    Esfandian, N.
    INTERNATIONAL JOURNAL OF ENGINEERING, 2020, 33 (01): : 105 - 111
  • [32] Spectro-Temporal Weighting of Loudness
    Oberfeld, Daniel
    Heeren, Wiebke
    Rennies, Jan
    Verhey, Jesko
    PLOS ONE, 2012, 7 (11):
  • [33] ROBUST SPECTRO-TEMPORAL FEATURES BASED ON AUTOREGRESSIVE MODELS OF HILBERT ENVELOPES
    Ganapathy, Sriram
    Thomas, Samuel
    Hermansky, Hynek
    2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 4286 - 4289
  • [34] Using Spectro-Temporal Features to Improve AFE Feature Extraction for ASR
    Ravuri, Suman V.
    Morgan, Nelson
    11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010, : 1181 - 1184
  • [35] DRUM EXTRACTION FROM POLYPHONIC MUSIC BASED ON A SPECTRO-TEMPORAL MODEL OF PERCUSSIVE SOUNDS
    Rigaud, Francois
    Lagrange, Mathieu
    Roebel, Axel
    Peeters, Geoffroy
    2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011, : 381 - 384
  • [36] Multi-Stream Spectro-Temporal Features for Robust Speech Recognition
    Zhao, Sherry Y.
    Morgan, Nelson
    INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5, 2008, : 898 - 901
  • [37] Classification of place of articulation in unvoiced stops with spectro-temporal surface modeling
    Karjigi, V.
    Rao, P.
    SPEECH COMMUNICATION, 2012, 54 (10) : 1104 - 1120
  • [38] Robust Dialect Identification System using Spectro-Temporal Gabor Features
    Chittaragi, Nagaratna B.
    Mothukuri, Siva Krishna P.
    Hegde, Pradyoth
    Koolagudi, Shashidhar G.
    PROCEEDINGS OF TENCON 2018 - 2018 IEEE REGION 10 CONFERENCE, 2018, : 1589 - 1594
  • [39] Auditory abstraction from spectro-temporal features to coding auditory entities
    Chechik, Gal
    Nelken, Israel
    PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2012, 109 (46) : 18968 - 18973
  • [40] SPECTRAL VS. SPECTRO-TEMPORAL FEATURES FOR ACOUSTIC EVENT DETECTION
    Cotton, Courtenay V.
    Ellis, Daniel P. W.
    2011 IEEE WORKSHOP ON APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS (WASPAA), 2011, : 69 - 72