Unsupervised Acoustic-to-Articulatory Inversion with Variable Vocal Tract Anatomy

Cited by: 2
Authors
Sun, Yifan [1]
Huang, Qinlong
Wu, Xihong
Affiliation
[1] Peking Univ, Dept Machine Intelligence, Speech & Hearing Res Ctr, Beijing, Peoples R China
Source
INTERSPEECH 2022
Funding
National Natural Science Foundation of China
Keywords
acoustic-to-articulatory inversion; vocal tract anatomy; adaptation
DOI
10.21437/Interspeech.2022-477
Chinese Library Classification
O42 [Acoustics]
Discipline Codes
070206; 082403
Abstract
Acoustic and articulatory variability across speakers has long limited the generalization performance of acoustic-to-articulatory inversion (AAI) methods. Speaker-independent AAI (SI-AAI) methods generally focus on transforming acoustic features and rarely consider direct matching in the articulatory space. Unsupervised AAI methods have the potential for better generalization but typically use a fixed morphological setting of a physical articulatory synthesizer even for different speakers, which may cause non-negligible articulatory compensation. In this paper, we propose to jointly estimate articulatory movements and vocal tract anatomy during the inversion of speech. An unsupervised AAI framework is employed, in which the estimated vocal tract anatomy configures a physical articulatory synthesizer, which in turn is driven by the estimated articulatory movements to imitate a given utterance. Experiments show that estimating vocal tract anatomy brings both acoustic and articulatory benefits. Acoustically, the reconstruction quality is higher; articulatorily, the estimated articulatory movement trajectories better match the measured ones. Moreover, the estimated anatomy parameters show clear clustering by speaker, indicating successful decoupling of speaker characteristics from linguistic content.
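As a rough illustration of the analysis-by-synthesis idea the abstract describes, the minimal sketch below jointly optimizes a time-invariant, speaker-level anatomy vector and a time-varying articulatory trajectory so that a synthesizer's output matches target speech features. This is an assumption-laden toy: the names (synthesize, N_ANATOMY, N_ARTIC), the linear stand-in synthesizer, the mel-spectrogram-like target, and the smoothness prior are all illustrative and not taken from the paper, which drives a physical articulatory synthesizer instead.

```python
import torch

torch.manual_seed(0)
N_ANATOMY, N_ARTIC, T, N_MEL = 8, 10, 200, 80
W = torch.randn(N_MEL, N_ARTIC + N_ANATOMY)  # fixed weights of the toy "synthesizer"

def synthesize(anatomy, trajectory):
    # Toy stand-in for a differentiable articulatory synthesizer: maps the
    # speaker's anatomy parameters plus a time-varying articulatory trajectory
    # to mel-spectrogram-like features. The real system uses a physical model.
    feats = torch.cat([trajectory, anatomy.expand(trajectory.shape[0], -1)], dim=-1)
    return torch.tanh(feats @ W.T)

target_mel = torch.randn(T, N_MEL)  # placeholder for features of the speech to imitate

anatomy = torch.zeros(N_ANATOMY, requires_grad=True)      # speaker-level, time-invariant
trajectory = torch.zeros(T, N_ARTIC, requires_grad=True)  # utterance-level, time-varying
opt = torch.optim.Adam([anatomy, trajectory], lr=1e-2)

for step in range(500):
    opt.zero_grad()
    pred = synthesize(anatomy, trajectory)
    # Acoustic reconstruction loss plus a smoothness prior on articulation
    # (the prior is an assumption here; it is a common regularizer in inversion).
    loss = torch.mean((pred - target_mel) ** 2) \
         + 1e-3 * torch.mean((trajectory[1:] - trajectory[:-1]) ** 2)
    loss.backward()
    opt.step()
```

The design point mirrored here is that the anatomy vector is shared across all frames while the trajectory varies per frame, so gradients through the same reconstruction loss update both the speaker characteristics and the linguistic content separately.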
Pages: 4656-4660
Page count: 5
Related Papers
(items [21]-[30] of 50 shown)
  • [21] Information theoretic acoustic feature selection for acoustic-to-articulatory inversion
    Ghosh, Prasanta Kumar
    Narayanan, Shrikanth S.
    14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 3176 - 3180
  • [22] Modeling the articulatory space using a hypercube codebook for acoustic-to-articulatory inversion
    Ouni, S
    Laprie, Y
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2005, 118 (01): : 444 - 460
  • [24] Acoustic-to-Articulatory Inversion Using Particle Swarm Optimization
    Fairee, Suthida
    Sirinaovakul, Booncharoen
    Prom-on, Santitham
    2015 12TH INTERNATIONAL CONFERENCE ON ELECTRICAL ENGINEERING/ELECTRONICS, COMPUTER, TELECOMMUNICATIONS AND INFORMATION TECHNOLOGY (ECTI-CON), 2015,
  • [25] Is average RMSE appropriate for evaluating acoustic-to-articulatory inversion?
    Fang, Qiang
    2019 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2019, : 997 - 1003
  • [26] Acoustic-to-articulatory inversion from infants' vowel vocalizations
    Oohashi, Hiroki
    Watanabe, Hama
    Taga, Gentaro
    NEUROSCIENCE RESEARCH, 2011, 71 : E286 - E286
  • [27] Improved subject-independent acoustic-to-articulatory inversion
    Afshan, Amber
    Ghosh, Prasanta Kumar
    SPEECH COMMUNICATION, 2015, 66 : 1 - 16
  • [28] Multi-corpus Acoustic-to-articulatory Speech Inversion
    Seneviratne, Nadee
    Sivaraman, Ganesh
    Espy-Wilson, Carol
    INTERSPEECH 2019, 2019, : 859 - 863
  • [29] Acoustic-to-articulatory mapping codebook constraint for determining vocal-tract length for inverse speech problem and articulatory synthesis
    Yu, ZL
    Zeng, SC
    2000 5TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING PROCEEDINGS, VOLS I-III, 2000, : 827 - 830
  • [30] Autoregressive Articulatory WaveNet Flow for Speaker-Independent Acoustic-to-Articulatory Inversion
    Bozorg, Narjes
    Johnson, Michael T.
    Soleymanpour, Mohammad
    2021 INTERNATIONAL CONFERENCE ON SPEECH TECHNOLOGY AND HUMAN-COMPUTER DIALOGUE (SPED), 2021, : 156 - 161