A silent speech system based on permanent magnet articulography and direct synthesis

Cited by: 40

Authors:

Gonzalez, Jose A. [1]

Cheah, Lam A. [2]

Gilbert, James M. [2]

Bai, Jie [2]

Ell, Stephen R. [3]

Green, Phil D. [1]

Moore, Roger K. [1]

Affiliations:

[1] Univ Sheffield, Dept Comp Sci, Sheffield S10 2TN, S Yorkshire, England

[2] Univ Hull, Sch Engn, Kingston Upon Hull, Yorks, England

[3] Hull & East Yorkshire Hosp Trust, Castle Hill Hosp, Cottingham, England

Source:

COMPUTER SPEECH AND LANGUAGE | 2016 / Vol. 39

Funding:

US National Institutes of Health;

Keywords:

Silent speech interfaces; Speech rehabilitation; Speech synthesis; Permanent magnet articulography; Augmentative and alternative communication; MAXIMUM-LIKELIHOOD-ESTIMATION; VOICE CONVERSION; VOCAL-TRACT; RECOGNITION; EXTRACTION;

DOI:

10.1016/j.csl.2016.02.002

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this paper we present a silent speech interface (SSI) system aimed at restoring speech communication for individuals who have lost their voice due to laryngectomy or diseases affecting the vocal folds. In the proposed system, articulatory data captured from the lips and tongue using permanent magnet articulography (PMA) are converted into audible speech using a speaker-dependent transformation learned from simultaneous recordings of PMA and audio signals acquired before laryngectomy. The transformation is represented using a mixture of factor analysers, which is a generative model that allows us to efficiently model non-linear behaviour and perform dimensionality reduction at the same time. The learned transformation is then deployed during normal usage of the SSI to restore the acoustic speech signal associated with the captured PMA data. The proposed system is evaluated using objective quality measures and listening tests on two databases containing PMA and audio recordings for normal speakers. Results show that it is possible to reconstruct speech from articulator movements captured by an unobtrusive technique without an intermediate recognition step. The SSI is capable of producing speech of sufficient intelligibility and naturalness that the speaker is clearly identifiable, but problems remain in scaling up the process to function consistently for phonetically rich vocabularies. (C) 2016 Elsevier Ltd. All rights reserved.


Pages: 67 - 87

Page count: 21

50 records in total

[31] Spotting words in silent speech videos: a retrieval-based approach
Jha, Abhishek
Namboodiri, Vinay P.
Jawahar, C. V.
MACHINE VISION AND APPLICATIONS, 2019, 30 (02) : 217 - 229
[32] Ultrasonic Doppler Based Silent Speech Interface Using Perceptual Distance
Lee, Ki-Seung
APPLIED SCIENCES-BASEL, 2022, 12 (02):
[33] Eigentongue feature extraction for an ultrasound-based silent speech interface
Hueber, T.
Aversano, G.
Chollet, G.
Denby, B.
Dreyfus, G.
Oussar, Y.
Roussel, P.
Stone, M.
2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PTS 1-3, PROCEEDINGS, 2007, : 1245 - +
[34] Artifact Removal Algorithm for an EMG-based Silent Speech Interface
Wand, Michael
Himmelsbach, Adam
Heistermann, Till
Janke, Matthias
Schultz, Tanja
2013 35TH ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY (EMBC), 2013, : 5750 - 5753
[35] Preliminary Test of a Wireless Magnetic Tongue Tracking System for Silent Speech Interface
Kim, Myungjong
Sebkhi, Nordine
Cao, Beiming
Ghovanloo, Maysam
Wang, Jun
2018 IEEE BIOMEDICAL CIRCUITS AND SYSTEMS CONFERENCE (BIOCAS): ADVANCED SYSTEMS FOR ENHANCING HUMAN HEALTH, 2018, : 13 - 16
[36] Analysis of Wind Driven Permanent Magnet Synchronous Generator for Power Grid System
Raj, R. Essaki
Vasudevan, N.
Srinivasan, S.
Abirami, T.
Pandian, R.
Gnanavel, C.
Parkunam, N.
Srinivasan, R.
Hurissa, Dejene
INTERNATIONAL TRANSACTIONS ON ELECTRICAL ENERGY SYSTEMS, 2025, 2025 (01):
[37] AN ANALYSIS OF MACHINE TRANSLATION AND SPEECH SYNTHESIS IN SPEECH-TO-SPEECH TRANSLATION SYSTEM
Hashimoto, Kei
Yamagishi, Junichi
Byrne, William
King, Simon
Tokuda, Keiichi
2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011, : 5108 - 5111
[38] Implementation of Telugu Speech Synthesis System
Ramya, Gangala
Naik, Nenavath Srinivas
2017 INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING, COMMUNICATIONS AND INFORMATICS (ICACCI), 2017, : 1151 - 1154
[39] Secure Speech Encryption System Using Segments for Speech Synthesis
Kohata, Minoru
2014 TENTH INTERNATIONAL CONFERENCE ON INTELLIGENT INFORMATION HIDING AND MULTIMEDIA SIGNAL PROCESSING (IIH-MSP 2014), 2014, : 264 - 267
[40] A Novel Modified Fuzzy-predictive Control of Permanent Magnet Synchronous Generator Based Wind Energy Conversion System
Akbari, Ehsan
Shadlu, Milad Samady
CHINESE JOURNAL OF ELECTRICAL ENGINEERING, 2023, 9 (04) : 107 - 121
