Speaker conditioned acoustic-to-articulatory inversion using x-vectors

Cited by: 3
Authors
Illa, Aravind [1]
Ghosh, Prasanta Kumar [1]
Affiliations
[1] Indian Inst Sci IISc, Elect Engn, Bangalore 560012, Karnataka, India
Source
Keywords
acoustic-to-articulatory inversion; BLSTM; x-vectors; SPEECH;
DOI
10.21437/Interspeech.2020-1222
Chinese Library Classification
R36 [Pathology]; R76 [Otorhinolaryngology];
Discipline classification codes
100104 ; 100213 ;
Abstract
Speech production involves the movement of various articulators, including the tongue, jaw, and lips. Estimating the movement of the articulators from the acoustics of speech is known as acoustic-to-articulatory inversion (AAI). Recently, it has been shown that pooling the acoustic-articulatory data from multiple speakers is more beneficial than training AAI in a speaker-specific manner. Further, conditioning AAI on speaker-specific information, supplied as a one-hot encoding at the input alongside the acoustic features, improves AAI performance in a closed-set condition where the same speakers appear in training and testing. In this work, we carry out an experimental study on the benefit of using x-vectors to provide the speaker-specific information that conditions AAI. Experiments with 30 speakers show that AAI performance benefits from the use of x-vectors in a closed-set, seen-speaker condition. Further, x-vectors also generalize well in unseen-speaker evaluation.
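The input-level conditioning described in the abstract can be illustrated with a minimal sketch. It only shows how a fixed speaker vector (one-hot or x-vector) might be appended to every acoustic frame before the frames reach an AAI model; it is not the authors' implementation. The 30-speaker count comes from the abstract, while the 13-dimensional acoustic features, the 512-dimensional x-vector, the random placeholder values, and all function names are assumptions made for illustration.

```python
import random


def one_hot_speaker(speaker_index, num_speakers):
    """Return a one-hot speaker code (only meaningful for speakers seen in training)."""
    vec = [0.0] * num_speakers
    vec[speaker_index] = 1.0
    return vec


def condition_on_speaker(frames, speaker_vec):
    """Append the same speaker vector to every acoustic frame."""
    return [list(frame) + list(speaker_vec) for frame in frames]


random.seed(0)

# Hypothetical utterance: 200 frames of 13-dimensional acoustic features.
acoustic_frames = [[random.gauss(0.0, 1.0) for _ in range(13)] for _ in range(200)]

# Closed-set conditioning: one-hot code over the 30 training speakers.
one_hot = one_hot_speaker(speaker_index=4, num_speakers=30)
closed_set_input = condition_on_speaker(acoustic_frames, one_hot)

# x-vector conditioning: random stand-in for an embedding that would normally
# come from a pretrained speaker-embedding extractor (512 dims is an assumption).
x_vector = [random.gauss(0.0, 1.0) for _ in range(512)]
xvector_input = condition_on_speaker(acoustic_frames, x_vector)

print(len(closed_set_input), len(closed_set_input[0]))
print(len(xvector_input), len(xvector_input[0]))
```

A one-hot code has no entry for a speaker outside the training set, whereas an embedding can be computed for any speaker's audio, which is consistent with the unseen-speaker evaluation the abstract reports for x-vectors.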
Pages: 1376 - 1380
Page count: 5
Related papers
50 in total
  • [21] The impact of speaking rate on acoustic-to-articulatory inversion
    Illa, Aravind
    Ghosh, Prasanta Kumar
    COMPUTER SPEECH AND LANGUAGE, 2020, 59: 75 - 90
  • [22] Incorporation of phonetic constraints in acoustic-to-articulatory inversion
    Potard, Blaise
    Laprie, Yves
    Ouni, Slim
    Journal of the Acoustical Society of America, 2008, 123 (04): 2310 - 2323
  • [23] A study of emotional information present in articulatory movements estimated using acoustic-to-articulatory inversion
    Kim, Jangwon
    Ghosh, Prasanta
    Lee, Sungbok
    Narayanan, Shrikanth S.
    2012 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2012
  • [24] Cross-speaker Acoustic-to-Articulatory Inversion using Phone-based Trajectory HMM for Pronunciation Training
    Hueber, Thomas
    Ben-Youssef, Atef
    Bailly, Gerard
    Badin, Pierre
    Elisei, Frederic
    13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012: 782 - 785
  • [25] ACOUSTIC-TO-ARTICULATORY INVERSION FOR DYSARTHRIC SPEECH BY USING CROSS-CORPUS ACOUSTIC-ARTICULATORY DATA
    Maharana, Sarthak Kumar
    Illa, Aravind
    Mannem, Renuka
    Belur, Yamini
    Shetty, Preetie
    Kumar, Veeramani Preethish
    Vengalil, Seena
    Polavarapu, Kiran
    Atchayaram, Nalini
    Ghosh, Prasanta Kumar
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021: 6458 - 6462
  • [26] Information theoretic acoustic feature selection for acoustic-to-articulatory inversion
    Ghosh, Prasanta Kumar
    Narayanan, Shrikanth S.
    14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013: 3176 - 3180
  • [27] REPRESENTATION LEARNING USING CONVOLUTION NEURAL NETWORK FOR ACOUSTIC-TO-ARTICULATORY INVERSION
    Illa, Aravind
    Ghosh, Prasanta Kumar
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019: 5931 - 5935
  • [28] Improved subject-independent acoustic-to-articulatory inversion
    National Institute of Technology, Karnataka , Mangalore
    575025, India
    Unknown
    560012, India
    Speech Commun, (1-16):
  • [29] Is average RMSE appropriate for evaluating acoustic-to-articulatory inversion?
    Fang, Qiang
    2019 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2019: 997 - 1003
  • [30] Acoustic-to-articulatory inversion from infants' vowel vocalizations
    Oohashi, Hiroki
    Watanabe, Hama
    Taga, Gentaro
    NEUROSCIENCE RESEARCH, 2011, 71 : E286 - E286