Sparse smoothing of articulatory features from Gaussian mixture model based acoustic-to-articulatory inversion: Benefit to speech recognition

Citations: 0
Authors
Sudhakar, Prasad [1]
Ghosh, Prasanta Kumar [2]
Affiliations
[1] Catholic Univ Louvain, ICTEAM ELEN, Louvain La Neuve, Belgium
[2] Indian Inst Sci IISc, Dept Elect Engn, Bangalore, Karnataka, India
Keywords
phonetic recognition; acoustic-to-articulatory inversion; smoothing; Gaussian mixture model; sparsity; Chambolle-Pock; ℓ1 minimization
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This work considers speech recognition with articulatory features estimated by acoustic-to-articulatory inversion (AAI). A recently proposed sparse smoothing approach is used to postprocess the estimates from Gaussian mixture model (GMM) based AAI under the minimum mean squared error (MMSE) criterion. Low-pass smoothing as a postprocessing step is well known to improve AAI performance. Sparse smoothing, on the other hand, not only improves AAI performance but also preserves MMSE optimality for as many estimates as possible. In this work, we investigate the benefit of preserving MMSE optimality during postprocessing by using the smoothed articulatory estimates in a broad-class phonetic recognition task. Experimental results show that low-pass filter based smoothing causes a significant drop in recognition accuracy compared with using the articulatory estimates without any smoothing. In contrast, the recognition accuracy obtained with articulatory features from sparse smoothing is similar to that obtained with articulatory features taken directly from GMM based AAI without any postprocessing. Thus, sparse smoothing provides a benefit in terms of both inversion performance and recognition accuracy, which is not the case for low-pass smoothing.
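The contrast between the two postprocessing styles can be illustrated with a small sketch. This is not the authors' implementation: the objective (an ℓ1 data term plus a quadratic second-difference penalty), the function names, the value of `lam`, and the toy trajectory standing in for frame-wise MMSE estimates are all assumptions made for illustration. The keywords mention Chambolle-Pock and ℓ1 minimization, so the sketch uses Chambolle-Pock primal-dual iterations on that assumed objective and compares the result with a moving-average low-pass filter.

```python
import numpy as np


def second_difference(n):
    """Return the (n-2) x n second-order difference matrix (a roughness measure)."""
    D = np.zeros((n - 2, n))
    for i in range(n - 2):
        D[i, i:i + 3] = [1.0, -2.0, 1.0]
    return D


def sparse_smooth(y, lam=5.0, n_iter=2000):
    """Chambolle-Pock iterations for the ASSUMED objective
    min_x ||x - y||_1 + (lam / 2) * ||D x||_2^2.
    The l1 data term encourages x to equal y at as many samples as possible."""
    n = len(y)
    D = second_difference(n)
    L = np.linalg.norm(D, 2)      # spectral norm of D
    tau = sigma = 0.99 / L        # step sizes satisfying tau * sigma * L^2 < 1
    x = y.copy()
    x_bar = y.copy()
    p = np.zeros(n - 2)
    for _ in range(n_iter):
        # Dual step: proximal map of the conjugate of (lam / 2) * ||.||^2
        p = (p + sigma * D @ x_bar) / (1.0 + sigma / lam)
        # Primal step: soft-threshold the deviation from the input estimates
        v = x - tau * D.T @ p - y
        x_new = y + np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)
        # Over-relaxation of the primal variable
        x_bar = 2.0 * x_new - x
        x = x_new
    return x


def count_preserved(x, y):
    """Number of samples left (numerically) unchanged by the smoothing."""
    return int(np.sum(np.isclose(x, y, rtol=0.0, atol=1e-9)))


# Toy trajectory (hypothetical): a smooth curve with three isolated jumps.
t = np.linspace(0.0, 1.0, 100)
y = np.sin(2.0 * np.pi * t)
y[[20, 50, 80]] += 0.5

# Baseline: 5-point moving average as a simple low-pass filter.
x_lowpass = np.convolve(y, np.ones(5) / 5.0, mode="same")
x_sparse = sparse_smooth(y)

D = second_difference(len(y))
print("samples preserved, low-pass:", count_preserved(x_lowpass, y))
print("samples preserved, sparse  :", count_preserved(x_sparse, y))
print("roughness before / after sparse smoothing:",
      np.sum((D @ y) ** 2), np.sum((D @ x_sparse) ** 2))
```

In this toy setting the low-pass filter alters essentially every sample, whereas the ℓ1 data term lets the smoothed trajectory coincide with the input at most samples and change only where the roughness penalty demands it. That is the sense in which a sparse smoother can keep the original estimates for as many frames as possible.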
Pages: 169-173
Number of pages: 5