Modulation analysis of speech through orthogonal FIR filterbank optimization

Cited by: 0
Authors
Le Roux, Jonathan [1, 2]
Kameoka, Hirokazu [1]
Ono, Nobutaka [1]
Sagayama, Shigeki [1]
de Cheveigne, Alain [3, 4]
Affiliations
[1] Univ Tokyo, Grad Sch Informat Sci & Technology, Tokyo, Japan
[2] EDITE, Paris, France
[3] Univ Paris 05, CNRS, Paris, France
[4] Ecole Normale Super, Paris, France
Source
2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12 | 2008
Keywords
modulation spectrum; filter optimization; natural gradient; data-driven analysis
DOI
10.1109/ICASSP.2008.4518578
Chinese Library Classification (CLC) number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
Newborns must learn to structure incoming acoustic information into segments, words, phrases, etc., before they can start to learn language. This process is thought to rely on the modulation structure of the speech waveform induced by segmental or prosodic regularities within the speech heard by the infant. Here, we investigate the process by which the initial acoustic processing required by modulation analysis can itself be tuned by exposure to the regularities of speech. Starting from the classic definition of modulation, as applied within channels of the peripheral filter, we formulate a mathematical framework in which the structure of initial spectral filtering is adapted for modulation analysis. Our working hypothesis is that the human ear and brain are adapted to the analysis of modulation, via a data-driven learning process on the scale of development (or possibly evolution). Simulation results are presented, along with a comparison with filterbanks classically used in signal processing.
Pages: 4189 / +
Number of pages: 2
Related papers
13 in total
  • [1] Natural gradient works efficiently in learning
    Amari, S
    [J]. NEURAL COMPUTATION, 1998, 10 (02) : 251 - 276
  • [2] Discovering words in the continuous speech stream: the role of prosody
    Christophe, A
    Gout, A
    Peperkamp, S
    Morgan, J
    [J]. JOURNAL OF PHONETICS, 2003, 31 (3-4) : 585 - 598
  • [3] Modeling auditory processing of amplitude modulation .1. Detection and masking with narrow-band carriers
    Dau, T
    Kollmeier, B
    Kohlrausch, A
    [J]. JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1997, 102 (05) : 2892 - 2905
  • [4] A neural circuit transforming temporal periodicity information into a rate-based representation in the mammalian auditory system
    Dicke, Ulrike
    Ewert, Stephan D.
    Dau, Torsten
    [J]. JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2007, 121 (01) : 310 - 326
  • [5] Gradient adaptive paraunitary filter banks for spatio-temporal subspace analysis and multichannel blind deconvolution
    Douglas, SC
    Amari, SI
    Kung, SY
    [J]. JOURNAL OF VLSI SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 2004, 37 (2-3) : 247 - 261
  • [6] Dynamics of precise spike timing in primary auditory cortex
    Elhilali, M
    Fritz, JB
    Klein, DJ
    Simon, JZ
    Shamma, SA
    [J]. JOURNAL OF NEUROSCIENCE, 2004, 24 (05) : 1159 - 1172
  • [7] Greenberg S, 1997, INT CONF ACOUST SPEE, P1647, DOI 10.1109/ICASSP.1997.598826
  • [8] HERMANSKY H, 2003, P ASRU 2003
  • [9] Hermansky H., 2005, P INT
  • [10] Speech perception problems of the hearing impaired reflect inability to use temporal fine structure
    Lorenzi, Christian
    Gilbert, Gaetan
    Carn, Heloise
    Garnier, Stephane
    Moore, Brian C. J.
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2006, 103 (49) : 18866 - 18869