DBN based multi-stream models for speech

Authors
Zhang, YM [1]
Diao, Q [1]
Huang, S [1]
Hu, W [1]
Bartels, C [1]
Bilmes, J [1]
Affiliations
[1] Intel China Research Center, Beijing, People's Republic of China
Source
2003 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING I, 2003
Keywords
None listed
DOI
Not available
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
We propose dynamic Bayesian network (DBN) based synchronous and asynchronous multi-stream models for noise-robust automatic speech recognition. In these models, multiple noise-robust features are combined into a single DBN to obtain better performance than any single feature system alone. Results on the Aurora 2.0 noisy speech task show significant improvements of our synchronous model over both single stream models and over a ROVER based fusion method.
Pages: 836-839
Number of pages: 4