Speaker conditioned acoustic-to-articulatory inversion using x-vectors

Cited by: 3
Authors
Illa, Aravind [1]
Ghosh, Prasanta Kumar [1]
Affiliation
[1] Indian Inst Sci IISc, Elect Engn, Bangalore 560012, Karnataka, India
Keywords
acoustic-to-articulatory inversion; BLSTM; x-vectors; speech
DOI
10.21437/Interspeech.2020-1222
Chinese Library Classification (CLC) number
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline classification code
100104; 100213
Abstract
Speech production involves the movement of various articulators, including the tongue, jaw, and lips. Estimating the movement of the articulators from the acoustics of speech is known as acoustic-to-articulatory inversion (AAI). Recently, it has been shown that pooling the acoustic-articulatory data from multiple speakers, rather than training AAI in a speaker-specific manner, is beneficial. Further, conditioning AAI on speaker-specific information, supplied as a one-hot encoding at the input alongside the acoustic features, improves AAI performance when the training and test speakers form a closed set. In this work, we carry out an experimental study on the benefit of using x-vectors to provide the speaker-specific information that conditions AAI. Experiments with 30 speakers show that AAI performance benefits from x-vectors in the closed-set, seen-speaker condition. Further, x-vectors also generalize well to unseen-speaker evaluation.
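The input-level conditioning described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it only shows one plausible way to append a per-utterance speaker vector (a one-hot code or an x-vector) to every acoustic frame before the frames reach an AAI model. The function names, the 13-dimensional acoustic features, and the 512-dimensional x-vector are assumptions made for illustration; only the count of 30 speakers comes from the abstract, and random arrays stand in for real features.

```python
import numpy as np


def condition_on_speaker(acoustic, speaker_vec):
    """Append the same speaker vector to every acoustic frame.

    acoustic:    (T, D) array of frame-level acoustic features
    speaker_vec: (S,) speaker representation (one-hot code or x-vector)
    returns:     (T, D + S) array to feed to an AAI model
    """
    acoustic = np.asarray(acoustic, dtype=np.float32)
    speaker_vec = np.asarray(speaker_vec, dtype=np.float32)
    if acoustic.ndim != 2 or speaker_vec.ndim != 1:
        raise ValueError("expected acoustic of shape (T, D) and speaker_vec of shape (S,)")
    # Repeat the utterance-level vector along the time axis (no copy is made here).
    tiled = np.broadcast_to(speaker_vec, (acoustic.shape[0], speaker_vec.shape[0]))
    return np.concatenate([acoustic, tiled], axis=1)


def one_hot(speaker_index, num_speakers):
    """One-hot speaker code; only defined for speakers seen in training."""
    vec = np.zeros(num_speakers, dtype=np.float32)
    vec[speaker_index] = 1.0
    return vec


rng = np.random.default_rng(0)
frames = rng.standard_normal((200, 13))  # assumption: 200 frames of 13-dim features
xvec = rng.standard_normal(512)          # assumption: stand-in for a 512-dim x-vector

# Closed-set conditioning: the speaker must be one of the 30 training speakers.
closed_set_input = condition_on_speaker(frames, one_hot(3, 30))
# x-vector conditioning: any speaker with an extracted embedding can be used.
open_set_input = condition_on_speaker(frames, xvec)

print(closed_set_input.shape)  # (200, 43)
print(open_set_input.shape)    # (200, 525)
```

The sketch makes the contrast in the abstract concrete: a one-hot code has a slot only for each training speaker, whereas an embedding such as an x-vector can be computed for a speaker the model has never seen.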
Pages: 1376-1380
Number of pages: 5
Related papers
50 items in total
  • [11] Incorporation of phonetic constraints in acoustic-to-articulatory inversion
    Potard, Blaise
    Laprie, Yves
    Ouni, Slim
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2008, 123 (04): 2310-2323
  • [12] Speaker adaptation method for acoustic-to-articulatory inversion using an HMM-based speech production model
    Hiroya, Sadao
    Honda, Masaaki
    IEICE Transactions on Information and Systems, 2004, E87-D (05): 1071-1078
  • [13] Speaker adaptation method for acoustic-to-articulatory inversion using an HMM-based speech production model
    Hiroya, S
    Honda, M
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2004, E87-D (05): 1071-1078
  • [14] SPEAKER RECOGNITION FOR MULTI-SPEAKER CONVERSATIONS USING X-VECTORS
    Snyder, David
    Garcia-Romero, Daniel
    Sell, Gregory
    McCree, Alan
    Povey, Daniel
    Khudanpur, Sanjeev
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019: 5796-5800
  • [15] A SUBJECT-INDEPENDENT ACOUSTIC-TO-ARTICULATORY INVERSION
    Ghosh, Prasanta Kumar
    Narayanan, Shrikanth S.
    2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011: 4624-4627
  • [16] A DEEP RECURRENT APPROACH FOR ACOUSTIC-TO-ARTICULATORY INVERSION
    Liu, Peng
    Yu, Quanjie
    Wu, Zhiyong
    Kang, Shiyin
    Meng, Helen
    Cai, Lianhong
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015: 4450-4454
  • [17] Acoustic-to-Articulatory Inversion based on Local Regression
    Al Moubayed, Samer
    Ananthakrishnan, G.
    11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010: 937-940
  • [18] A generalized smoothness criterion for acoustic-to-articulatory inversion
    Ghosh, Prasanta Kumar
    Narayanan, Shrikanth
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2010, 128 (04): 2162-2172
  • [19] Acoustic-to-Articulatory Inversion with Deep Autoregressive Articulatory-WaveNet
    Bozorg, Narjes
    Johnson, Michael T.
    INTERSPEECH 2020, 2020: 3725-3729
  • [20] PERFORMANCES OF UNSUPERVISED HMM IN ACOUSTIC-TO-ARTICULATORY INVERSION
    Lachambre, Helene
    Koenig, Lionel
    Andre-Obrecht, Regine
    2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013: 7140-7144