Speech segregation based on pitch tracking and amplitude modulation

Cited by: 60
Authors
Hu, GN [1]
Wang, DL [1]
Affiliation
[1] Ohio State Univ, Biophys Program, Columbus, OH 43210 USA
Source
PROCEEDINGS OF THE 2001 IEEE WORKSHOP ON THE APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS | 2001
DOI
10.1109/ASPAA.2001.969547
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Speech segregation is an important task of auditory scene analysis (ASA), in which the speech of a certain speaker is separated from other interfering signals. Wang and Brown proposed a multistage neural model for speech segregation, the core of which is a two-layer oscillator network. In this paper, we extend their model by adding further processes based on psychoacoustic evidence to improve the performance. These processes include pitch tracking and grouping based on amplitude modulation (AM). Our model is systematically evaluated and compared with the Wang-Brown model, and it yields significantly better performance.
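The abstract does not say how pitch is tracked, so the following is only a minimal sketch of one common approach: picking the autocorrelation peak within a plausible pitch range. It is not the authors' model. The function names, the 60-500 Hz search range, and the synthetic three-harmonic test tone are all assumptions made for illustration.

```python
import math


def harmonic_tone(f0, fs, n_samples, n_harmonics=3):
    """Synthesize a periodic test signal: equal-amplitude harmonics of f0 (hypothetical helper)."""
    return [
        sum(math.sin(2 * math.pi * k * f0 * n / fs) for k in range(1, n_harmonics + 1))
        for n in range(n_samples)
    ]


def estimate_pitch(signal, fs, window, f_min=60.0, f_max=500.0):
    """Return the pitch in Hz whose lag maximizes the autocorrelation.

    The search range f_min..f_max is an assumed range for voiced speech,
    not a value taken from the paper.
    """
    min_lag = int(fs / f_max)
    max_lag = int(fs / f_min)
    if len(signal) < window + max_lag:
        raise ValueError("signal too short for the requested lag range")
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Correlate a fixed-length window with a copy of the signal shifted by `lag`.
        score = sum(signal[n] * signal[n + lag] for n in range(window))
        if score > best_score:
            best_lag, best_score = lag, score
    return fs / best_lag


fs = 8000                                   # sampling rate in Hz (assumed)
tone = harmonic_tone(100.0, fs, 400)        # 50 ms of a 100 Hz harmonic complex
pitch = estimate_pitch(tone, fs, window=240)
print(pitch)
```

A full pitch tracker would repeat this estimate frame by frame and smooth the resulting contour. The sketch covers a single frame only.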
Pages: 79-82
Page count: 4
Related papers
10 items in total
  • [1] Albert S. Bregman, 1990, AUDITORY SCENE ANAL, P411, DOI: 10.1121/1.408434, 10.7551/MITPRESS/1486.001.0001
  • [2] COMPUTATIONAL AUDITORY SCENE ANALYSIS
    BROWN, GJ
    COOKE, M
    [J]. COMPUTER SPEECH AND LANGUAGE, 1994, 8 (04) : 297 - 336
  • [3] Robust automatic speech recognition with missing and unreliable acoustic data
    Cooke, M
    Green, P
    Josifovski, L
    Vizinho, A
    [J]. SPEECH COMMUNICATION, 2001, 34 (03) : 267 - 285
  • [4] COOKE M, 1993, MODELING AUDITORY PR
  • [5] ELLIS D, 1996, THESIS MIT
  • [6] BLIND SEPARATION OF SOURCES .1. AN ADAPTIVE ALGORITHM BASED ON NEUROMIMETIC ARCHITECTURE
    JUTTEN, C
    HERAULT, J
    [J]. SIGNAL PROCESSING, 1991, 24 (01) : 1 - 10
  • [7] Blind source separation of more sources than mixtures using overcomplete representations
    Lee, TW
    Lewicki, MS
    Girolami, M
    Sejnowski, TJ
    [J]. IEEE SIGNAL PROCESSING LETTERS, 1999, 6 (04) : 87 - 90
  • [8] Moore B.C.J., 1997, INTRO PSYCHOL HEARIN
  • [9] A comparison of auditory and blind separation techniques for speech segregation
    van der Kouwe, AJW
    Wang, DL
    Brown, GJ
    [J]. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2001, 9 (03) : 189 - 195
  • [10] Separation of speech from interfering sounds based on oscillatory correlation
    Wang, DLL
    Brown, GJ
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS, 1999, 10 (03) : 684 - 697