A user-friendly headset for radar-based silent speech recognition

Cited by: 1
Authors
Digehsara, Pouriya Amini [1 ]
de Menezes, Joao Vitor Possamai [1 ]
Wagner, Christoph [1 ]
Baerhold, Michael [2 ]
Schaffer, Petr [2 ]
Plettemeier, Dirk [2 ]
Birkholz, Peter [1 ]
Affiliations
[1] Tech Univ Dresden, Inst Acoust & Speech Commun, Dresden, Germany
[2] Tech Univ Dresden, Inst Commun Technol, Dresden, Germany
Source
INTERSPEECH 2022 | 2022
Keywords
silent speech interfaces; wearable headset; BiLSTM; radar imaging; speech-related biosignals;
DOI
10.21437/Interspeech.2022-10090
Chinese Library Classification (CLC)
O42 [Acoustics];
Discipline Codes
070206; 082403;
Abstract
Silent speech interfaces allow speech communication to take place in the absence of an acoustic speech signal. Radar-based sensing with radio antennas on the speaker's face can be used as a non-invasive modality to measure speech articulation in such applications. One of the major challenges with this approach is the variability between sessions, caused mainly by the repositioning of the antennas on the speaker's face. To reduce the impact of this factor, we developed a wearable headset that can be 3D-printed from flexible materials and weighs only about 69 g. For evaluation, a radar-based word recognition experiment was performed in which five speakers recorded a speech corpus in multiple sessions, alternately using the headset and double-sided tape to place the antennas on the face. Using a bidirectional long short-term memory (BiLSTM) network for classification, average intersession word accuracies of 76.50% with the headset and 68.18% with the tape were obtained. This indicates that the antenna (re-)positioning accuracy of the headset is at least as good as that of the double-sided tape, while the headset offers additional practical benefits.
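For illustration, the following is a minimal PyTorch sketch of a BiLSTM word classifier of the kind the abstract describes; the class name, feature dimension, hidden size, and number of word classes are illustrative assumptions, not values reported in the paper.

import torch
import torch.nn as nn

class BiLSTMWordClassifier(nn.Module):
    """Hypothetical BiLSTM classifier for radar feature sequences."""
    def __init__(self, n_features=128, hidden=256, n_words=25):
        super().__init__()
        # Bidirectional LSTM reads the radar feature sequence in both directions.
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True,
                            bidirectional=True)
        # The concatenated final hidden states feed a linear word classifier.
        self.fc = nn.Linear(2 * hidden, n_words)

    def forward(self, x):                       # x: (batch, time, n_features)
        _, (h_n, _) = self.lstm(x)              # h_n: (2, batch, hidden)
        h = torch.cat([h_n[0], h_n[1]], dim=1)  # forward and backward states
        return self.fc(h)                       # logits: (batch, n_words)

model = BiLSTMWordClassifier()
logits = model(torch.randn(4, 100, 128))        # 4 utterances of 100 frames
print(logits.shape)                             # torch.Size([4, 25])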
Pages: 4835-4839
Page count: 5
Related Papers
18 in total
  • [1] Angrick, M., Herff, C., Mugler, E., Tate, M. C., Slutzky, M. W., Krusienski, D. J., Schultz, T. Speech synthesis from ECoG using densely connected 3D convolutional neural networks. Journal of Neural Engineering, 2019, 16(3).
  • [2] Anumanchipalli, G. K., Chartier, J., Chang, E. F. Speech synthesis from neural decoding of spoken sentences. Nature, 2019, 568(7753): 493+.
  • [3] Birkholz, P., Stone, S., Wolf, K., Plettemeier, D. Non-invasive silent phoneme recognition using microwave signals. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2018, 26(12): 2404-2411.
  • [4] Bocquelet, F., Hueber, T., Girin, L., Savariaux, C., Yvert, B. Real-time control of an articulatory-based speech synthesizer for brain computer interfaces. PLOS Computational Biology, 2016, 12(11).
  • [5] Digehsara, P. A. Studientexte zur Sprachkommunikation, 2021: 112.
  • [6] Eid, A. M. IEEE Antennas and Wireless Propagation Letters, 2009, 8.
  • [7] Fagan, M. J., Ell, S. R., Gilbert, J. M., Sarrazin, E., Chapman, P. M. Development of a (silent) speech recognition system for patients following laryngectomy. Medical Engineering & Physics, 2008, 30(4): 419-425.
  • [8] Holzrichter, J. F. Journal of the Acoustical Society of America, 1998, 103.
  • [9] Maier-Hein, L. 2005 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), 2005: 331.
  • [10] Pucher, M., Klingler, N., Luttenberger, J., Spreafico, L. Accuracy, recording interference, and articulatory quality of headsets for ultrasound recordings. Speech Communication, 2020, 123: 83-97.