A silent speech system based on permanent magnet articulography and direct synthesis

Cited by: 40
Authors
Gonzalez, Jose A. [1 ]
Cheah, Lam A. [2 ]
Gilbert, James M. [2 ]
Bai, Jie [2 ]
Ell, Stephen R. [3 ]
Green, Phil D. [1 ]
Moore, Roger K. [1 ]
Affiliations
[1] Univ Sheffield, Dept Comp Sci, Sheffield S10 2TN, S Yorkshire, England
[2] Univ Hull, Sch Engn, Kingston Upon Hull, Yorks, England
[3] Hull & East Yorkshire Hosp Trust, Castle Hill Hosp, Cottingham, England
Funding
National Institutes of Health (USA);
Keywords
Silent speech interfaces; Speech rehabilitation; Speech synthesis; Permanent magnet articulography; Augmentative and alternative communication; MAXIMUM-LIKELIHOOD-ESTIMATION; VOICE CONVERSION; VOCAL-TRACT; RECOGNITION; EXTRACTION;
DOI
10.1016/j.csl.2016.02.002
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper we present a silent speech interface (SSI) system aimed at restoring speech communication for individuals who have lost their voice due to laryngectomy or diseases affecting the vocal folds. In the proposed system, articulatory data captured from the lips and tongue using permanent magnet articulography (PMA) are converted into audible speech using a speaker-dependent transformation learned from simultaneous recordings of PMA and audio signals acquired before laryngectomy. The transformation is represented using a mixture of factor analysers, which is a generative model that allows us to efficiently model non-linear behaviour and perform dimensionality reduction at the same time. The learned transformation is then deployed during normal usage of the SSI to restore the acoustic speech signal associated with the captured PMA data. The proposed system is evaluated using objective quality measures and listening tests on two databases containing PMA and audio recordings for normal speakers. Results show that it is possible to reconstruct speech from articulator movements captured by an unobtrusive technique without an intermediate recognition step. The SSI is capable of producing speech of sufficient intelligibility and naturalness that the speaker is clearly identifiable, but problems remain in scaling up the process to function consistently for phonetically rich vocabularies. (C) 2016 Elsevier Ltd. All rights reserved.
Pages: 67-87
Page count: 21
Related papers
50 in total
  • [31] Spotting words in silent speech videos: a retrieval-based approach
    Jha, Abhishek
    Namboodiri, Vinay P.
    Jawahar, C. V.
    MACHINE VISION AND APPLICATIONS, 2019, 30 (02) : 217 - 229
  • [32] Ultrasonic Doppler Based Silent Speech Interface Using Perceptual Distance
    Lee, Ki-Seung
    APPLIED SCIENCES-BASEL, 2022, 12 (02):
  • [33] Eigentongue feature extraction for an ultrasound-based silent speech interface
    Hueber, T.
    Aversano, G.
    Chollet, G.
    Denby, B.
    Dreyfus, G.
    Oussar, Y.
    Roussel, P.
    Stone, M.
    2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PTS 1-3, PROCEEDINGS, 2007, : 1245 - +
  • [34] Artifact Removal Algorithm for an EMG-based Silent Speech Interface
    Wand, Michael
    Himmelsbach, Adam
    Heistermann, Till
    Janke, Matthias
    Schultz, Tanja
    2013 35TH ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY (EMBC), 2013, : 5750 - 5753
  • [35] Preliminary Test of a Wireless Magnetic Tongue Tracking System for Silent Speech Interface
    Kim, Myungjong
    Sebkhi, Nordine
    Cao, Beiming
    Ghovanloo, Maysam
    Wang, Jun
    2018 IEEE BIOMEDICAL CIRCUITS AND SYSTEMS CONFERENCE (BIOCAS): ADVANCED SYSTEMS FOR ENHANCING HUMAN HEALTH, 2018, : 13 - 16
  • [36] Analysis of Wind Driven Permanent Magnet Synchronous Generator for Power Grid System
    Raj, R. Essaki
    Vasudevan, N.
    Srinivasan, S.
    Abirami, T.
    Pandian, R.
    Gnanavel, C.
    Parkunam, N.
    Srinivasan, R.
    Hurissa, Dejene
    INTERNATIONAL TRANSACTIONS ON ELECTRICAL ENERGY SYSTEMS, 2025, 2025 (01):
  • [37] AN ANALYSIS OF MACHINE TRANSLATION AND SPEECH SYNTHESIS IN SPEECH-TO-SPEECH TRANSLATION SYSTEM
    Hashimoto, Kei
    Yamagishi, Junichi
    Byrne, William
    King, Simon
    Tokuda, Keiichi
    2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011, : 5108 - 5111
  • [38] Implementation of Telugu Speech Synthesis System
    Ramya, Gangala
    Naik, Nenavath Srinivas
    2017 INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING, COMMUNICATIONS AND INFORMATICS (ICACCI), 2017, : 1151 - 1154
  • [39] Secure Speech Encryption System Using Segments for Speech Synthesis
    Kohata, Minoru
    2014 TENTH INTERNATIONAL CONFERENCE ON INTELLIGENT INFORMATION HIDING AND MULTIMEDIA SIGNAL PROCESSING (IIH-MSP 2014), 2014, : 264 - 267
  • [40] A Novel Modified Fuzzy-predictive Control of Permanent Magnet Synchronous Generator Based Wind Energy Conversion System
    Akbari, Ehsan
    Shadlu, Milad Samady
    CHINESE JOURNAL OF ELECTRICAL ENGINEERING, 2023, 9 (04): : 107 - 121