A silent speech system based on permanent magnet articulography and direct synthesis

Cited by: 40
Authors
Gonzalez, Jose A. [1 ]
Cheah, Lam A. [2 ]
Gilbert, James M. [2 ]
Bai, Jie [2 ]
Ell, Stephen R. [3 ]
Green, Phil D. [1 ]
Moore, Roger K. [1 ]
Affiliations
[1] Univ Sheffield, Dept Comp Sci, Sheffield S10 2TN, S Yorkshire, England
[2] Univ Hull, Sch Engn, Kingston Upon Hull, Yorks, England
[3] Hull & East Yorkshire Hosp Trust, Castle Hill Hosp, Cottingham, England
Funding
National Institutes of Health (USA);
Keywords
Silent speech interfaces; Speech rehabilitation; Speech synthesis; Permanent magnet articulography; Augmentative and alternative communication; MAXIMUM-LIKELIHOOD-ESTIMATION; VOICE CONVERSION; VOCAL-TRACT; RECOGNITION; EXTRACTION;
DOI
10.1016/j.csl.2016.02.002
CLC classification
TP18 [Artificial intelligence theory];
Discipline codes
081104; 0812; 0835; 1405;
Abstract
In this paper we present a silent speech interface (SSI) system aimed at restoring speech communication for individuals who have lost their voice due to laryngectomy or diseases affecting the vocal folds. In the proposed system, articulatory data captured from the lips and tongue using permanent magnet articulography (PMA) are converted into audible speech using a speaker-dependent transformation learned from simultaneous recordings of PMA and audio signals acquired before laryngectomy. The transformation is represented using a mixture of factor analysers, which is a generative model that allows us to efficiently model non-linear behaviour and perform dimensionality reduction at the same time. The learned transformation is then deployed during normal usage of the SSI to restore the acoustic speech signal associated with the captured PMA data. The proposed system is evaluated using objective quality measures and listening tests on two databases containing PMA and audio recordings for normal speakers. Results show that it is possible to reconstruct speech from articulator movements captured by an unobtrusive technique without an intermediate recognition step. The SSI is capable of producing speech of sufficient intelligibility and naturalness that the speaker is clearly identifiable, but problems remain in scaling up the process to function consistently for phonetically rich vocabularies. (C) 2016 Elsevier Ltd. All rights reserved.
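The abstract describes learning a speaker-dependent transformation from paired PMA and audio recordings: a generative model over the joint articulatory-acoustic feature space, from which acoustic features are predicted conditionally at run time. As a minimal illustration of that idea (not the paper's actual mixture of factor analysers, which uses multiple low-rank components), the sketch below fits a single joint Gaussian over synthetic stand-in features and predicts acoustic features via the conditional mean; all data and dimensions here are hypothetical toy values.

```python
import numpy as np

# Toy stand-ins for paired training data: X = PMA articulatory features,
# Y = acoustic features. The real system records these simultaneously
# before laryngectomy; here Y is a noisy linear function of X.
rng = np.random.default_rng(0)
n, dx, dy = 500, 6, 4
A = rng.normal(size=(dy, dx))
X = rng.normal(size=(n, dx))
Y = X @ A.T + 0.1 * rng.normal(size=(n, dy))

# Fit a joint Gaussian over z = [x; y] -- the single-component special
# case of a mixture model over the joint articulatory-acoustic space.
Z = np.hstack([X, Y])
mu = Z.mean(axis=0)
S = np.cov(Z, rowvar=False)
Sxx, Sxy = S[:dx, :dx], S[:dx, dx:]

def predict(x):
    # Conditional mean E[y | x] = mu_y + Syx Sxx^{-1} (x - mu_x):
    # the MMSE estimate of acoustic features given articulatory input.
    return mu[dx:] + (x - mu[:dx]) @ np.linalg.solve(Sxx, Sxy)

Y_hat = predict(X)
rmse = np.sqrt(np.mean((Y_hat - Y) ** 2))
```

With a single Gaussian the mapping is globally linear; the paper's mixture of factor analysers extends this by tiling the space with several low-dimensional components, which is what captures the non-linear articulatory-to-acoustic behaviour.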
Pages: 67-87
Page count: 21
Related papers
50 in total
  • [21] HMM-based Tibetan Lhasa Speech Synthesis System
    Wu Zhiqiang
    Yu Hongzhi
    Li Guanyu
    Wan Shuhui
    2013 3RD INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND NETWORK TECHNOLOGY (ICCSNT), 2013, : 92 - 95
  • [22] MagTrack: A Wearable Tongue Motion Tracking System for Silent Speech Interfaces
    Cao, Beiming
    Ravi, Shravan
    Sebkhi, Nordine
    Bhavsar, Arpan
    Inan, Omer T.
    Xu, Wen
    Wang, Jun
    JOURNAL OF SPEECH LANGUAGE AND HEARING RESEARCH, 2023, 66 (08): : 3206 - 3221
  • [23] Speech synthesis system and handicap
    Moudenc, T
    Emerard, F
    ANNALES DES TELECOMMUNICATIONS-ANNALS OF TELECOMMUNICATIONS, 2003, 58 (5-6): : 928 - 934
  • [24] Small-vocabulary speech recognition using a silent speech interface based on magnetic sensing
    Hofe, Robin
    Ell, Stephen R.
    Fagan, Michael J.
    Gilbert, James M.
    Green, Phil D.
    Moore, Roger K.
    Rybchenko, Sergey I.
    SPEECH COMMUNICATION, 2013, 55 (01) : 22 - 32
  • [25] Implementation and Evaluation of an HMM-based Thai Speech Synthesis System
    Chomphan, Suphattharachai
    Kobayashi, Takao
    INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 173 - 176
  • [26] Continuous synthesis of artificial speech sounds from human cortical surface recordings during silent speech production
    Meng, Kevin
    Goodarzy, Farhad
    Kim, EuiYoung
    Park, Ye Jin
    Kim, June Sic
    Cook, Mark J.
    Chung, Chun Kee
    Grayden, David B.
    JOURNAL OF NEURAL ENGINEERING, 2023, 20 (04)
  • [27] An Investigation of Implementation and Performance Analysis of DNN Based Speech Synthesis System
    Chen, Zhehuai
    Yu, Kai
    2014 12TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING (ICSP), 2014, : 577 - 582
  • [28] Optical Character Recognition Based Speech Synthesis System Using LabVIEW
    Singla, S. K.
    Yadav, R. K.
    JOURNAL OF APPLIED RESEARCH AND TECHNOLOGY, 2014, 12 (05) : 919 - 926
  • [29] Implementation and evaluation of an HMM-based Korean speech synthesis system
    Kim, SJ
    Kim, JJ
    Hahn, M
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2006, E89D (03): : 1116 - 1119
  • [30] Eigentongue feature extraction for an ultrasound-based silent speech interface
    Hueber, T.
    Aversano, G.
    Chollet, G.
    Denby, B.
    Dreyfus, G.
    Oussar, Y.
    Roussel, P.
    Stone, M.
    2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PTS 1-3, PROCEEDINGS, 2007, : 1245+