Speaker conditioned acoustic-to-articulatory inversion using x-vectors

Cited by: 3
Authors
Illa, Aravind [1 ]
Ghosh, Prasanta Kumar [1 ]
Affiliations
[1] Indian Inst Sci IISc, Elect Engn, Bangalore 560012, Karnataka, India
Source
Keywords
acoustic-to-articulatory inversion; BLSTM; x-vectors; SPEECH;
DOI
10.21437/Interspeech.2020-1222
Chinese Library Classification (CLC) number
R36 [Pathology]; R76 [Otorhinolaryngology];
Subject classification code
100104 ; 100213 ;
Abstract
Speech production involves the movement of various articulators, including the tongue, jaw, and lips. Estimating the movement of the articulators from the acoustics of speech is known as acoustic-to-articulatory inversion (AAI). Recently, it has been shown that instead of training AAI in a speaker-specific manner, pooling the acoustic-articulatory data from multiple speakers is beneficial. Further, additional conditioning with speaker-specific information, provided by one-hot encoding at the input of AAI along with the acoustic features, benefits AAI performance in a closed-set condition where the training and test speakers are the same. In this work, we carry out an experimental study on the benefit of using x-vectors to provide speaker-specific information for conditioning AAI. Experiments with 30 speakers show that AAI performance benefits from the use of x-vectors in a closed-set, seen-speaker condition. Further, x-vectors also generalize well to unseen-speaker evaluation.
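As a rough illustration of the input conditioning the abstract describes, the sketch below appends one fixed speaker vector (either a one-hot code or an x-vector) to every acoustic frame of an utterance. This is not the authors' implementation: the function names, the feature dimensions, and the choice to concatenate the x-vector at the input frame by frame are all assumptions made for the example, and the numbers are placeholders.

```python
from typing import List, Sequence


def one_hot(index: int, num_speakers: int) -> List[float]:
    """One-hot speaker code; only defined for speakers seen in training."""
    if not 0 <= index < num_speakers:
        raise ValueError("speaker index outside the closed training set")
    return [1.0 if i == index else 0.0 for i in range(num_speakers)]


def condition_on_speaker(
    frames: Sequence[Sequence[float]], speaker_vec: Sequence[float]
) -> List[List[float]]:
    """Append the same speaker vector to every acoustic frame."""
    return [list(frame) + list(speaker_vec) for frame in frames]


# Toy utterance: 3 frames of 4-dimensional acoustic features (placeholders).
frames = [
    [0.1, 0.2, 0.3, 0.4],
    [0.5, 0.6, 0.7, 0.8],
    [0.9, 1.0, 1.1, 1.2],
]

# Closed-set conditioning: speaker 1 out of 3 training speakers (assumed).
closed_set_input = condition_on_speaker(frames, one_hot(1, 3))

# x-vector conditioning: a hypothetical 2-dimensional embedding that a
# separately trained speaker-embedding model could also produce for a
# speaker never seen during AAI training.
x_vector = [0.9, -0.3]
x_vector_input = condition_on_speaker(frames, x_vector)
```

The conditioned frames would then be fed to the sequence model (a BLSTM, per the keywords). A one-hot code has no entry for a new speaker, whereas an embedding can be computed for any speaker, which is the property the unseen-speaker evaluation relies on.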
Pages: 1376-1380
Page count: 5
Related papers
50 in total
  • [1] Autoregressive Articulatory WaveNet Flow for Speaker-Independent Acoustic-to-Articulatory Inversion
    Bozorg, Narjes
    Johnson, Michael T.
    Soleymanpour, Mohammad
    2021 INTERNATIONAL CONFERENCE ON SPEECH TECHNOLOGY AND HUMAN-COMPUTER DIALOGUE (SPED), 2021, : 156 - 161
  • [2] ACOUSTIC-TO-ARTICULATORY INVERSION USING AN EPISODIC MEMORY
    Demange, S.
    Ouni, S.
    2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011, : 4620 - 4623
  • [3] Acoustic-to-Articulatory Inversion Using Particle Swarm Optimization
    Fairee, Suthida
    Sirinaovakul, Booncharoen
    Prom-on, Santitham
    2015 12TH INTERNATIONAL CONFERENCE ON ELECTRICAL ENGINEERING/ELECTRONICS, COMPUTER, TELECOMMUNICATIONS AND INFORMATION TECHNOLOGY (ECTI-CON), 2015,
  • [4] Jerk Minimization for Acoustic-To-Articulatory Inversion
    Rajpal, Avni
    Patil, Hemant A.
    9th ISCA Speech Synthesis Workshop, SSW 2016, 2016, : 82 - 87
  • [5] Modeling the articulatory space using a hypercube codebook for acoustic-to-articulatory inversion
    Ouni, S
    Laprie, Y
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2005, 118 (01): : 444 - 460
  • [6] Parallel Reference Speaker Weighting for Kinematic-Independent Acoustic-to-Articulatory Inversion
    Ji, An
    Johnson, Michael T.
    Berry, Jeffrey J.
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2016, 24 (10) : 1865 - 1875
  • [7] Speaker dependent acoustic-to-articulatory inversion using real-time MRI of the vocal tract
    Csapo, Tamas Gabor
    INTERSPEECH 2020, 2020, : 3720 - 3724
  • [8] Vocal tract length normalization for speaker independent acoustic-to-articulatory speech inversion
    Sivaraman, Ganesh
    Mitra, Vikramjit
    Nam, Hosung
    Tiede, Mark
    Espy-Wilson, Carol
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 455 - 459
  • [9] Formant Trajectories for Acoustic-to-Articulatory Inversion
    Ozbek, I. Yuecel
    Hasegawa-Johnson, Mark
    Demirekler, Muebeccel
    INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 2783 - +
  • [10] Closed-set speaker conditioned acoustic-to-articulatory inversion using bi-directional long short term memory network
    Illa, Aravind
    Ghosh, Prasanta Kumar
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2020, 147 (02): : EL171 - EL176