Machine Learning Methods for Automatic Silent Speech Recognition Using a Wearable Graphene Strain Gauge Sensor

Cited by: 17
Authors
Ravenscroft, Dafydd [1 ]
Prattis, Ioannis [1 ]
Kandukuri, Tharun [1 ]
Samad, Yarjan Abdul [1 ]
Mallia, Giorgio [1 ]
Occhipinti, Luigi G. [1 ]
Affiliations
[1] Univ Cambridge, Dept Elect Engn, Cambridge CB3 0FA, England
Funding
UK Engineering and Physical Sciences Research Council (EPSRC);
Keywords
artificial neural networks; graphene; machine learning; silent speech recognition; strain gauge; performance
DOI
10.3390/s22010299
Chinese Library Classification (CLC)
O65 [Analytical Chemistry];
Discipline codes
070302; 081704;
Abstract
Silent speech recognition is the ability to recognise intended speech without audio information. It is useful in situations where sound waves are not produced or cannot be heard, for example for speakers with physical voice impairments or in environments where audio transmission is not reliable or secure. A device that detects non-auditory signals and maps them to intended phonation could assist in such situations. In this work, we propose a graphene-based strain gauge sensor that can be worn on the throat to detect small muscle movements and vibrations; machine learning algorithms then decode these non-audio signals and predict the intended speech. The sensor is highly wearable, exploiting graphene's strength, flexibility and high conductivity, and is fabricated by screen printing graphene onto lycra fabric so that it can pick up small throat movements. A framework for interpreting the sensor output is proposed which explores several machine learning techniques for predicting intended words from the signals. A dataset of 15 unique words and four movements, each with 20 repetitions, was collected and used to train the machine learning algorithms. The results demonstrate that such sensors can predict spoken words, with a word accuracy of 55% on the word dataset and 85% on the movement dataset. This work provides a proof of concept for the viability of combining a highly wearable graphene strain gauge with machine learning methods to automate silent speech recognition.
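The abstract does not spell out the signal-processing or classification pipeline. The sketch below is an illustrative reconstruction only: it assumes windowed 1-D strain-gauge recordings, simple hand-crafted features, and a random-forest classifier, and uses synthetic signals in place of the real throat recordings. The class count (15 words plus 4 movements) and the 20 repetitions per class follow the abstract; everything else (window length, the make_synthetic_dataset helper, feature set, classifier settings) is a hypothetical stand-in.

```python
# Illustrative sketch of a silent-speech classification pipeline for 1-D
# strain-gauge signals. Data, features and classifier are assumptions,
# not the authors' published method.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

N_CLASSES = 19        # 15 words + 4 movements (from the abstract)
REPS_PER_CLASS = 20   # 20 repetitions per class (from the abstract)
WINDOW = 500          # samples per recording -- an assumed value

def make_synthetic_dataset():
    """Stand-in for real strain-gauge recordings: one noisy sinusoid per class."""
    X, y = [], []
    t = np.linspace(0.0, 1.0, WINDOW)
    for label in range(N_CLASSES):
        for _ in range(REPS_PER_CLASS):
            amplitude = 1.0 + 0.1 * label  # class-dependent amplitude for separability
            signal = amplitude * np.sin(2 * np.pi * (2 + label) * t)
            signal += 0.3 * rng.standard_normal(WINDOW)
            X.append(signal)
            y.append(label)
    return np.array(X), np.array(y)

def extract_features(X):
    """Simple per-window features: mean, standard deviation, peak-to-peak, energy."""
    return np.column_stack([
        X.mean(axis=1),
        X.std(axis=1),
        np.ptp(X, axis=1),
        (X ** 2).sum(axis=1),
    ])

X, y = make_synthetic_dataset()
features = extract_features(X)

# 5-fold cross-validation of a random-forest classifier on the feature vectors.
clf = RandomForestClassifier(n_estimators=200, random_state=0)
scores = cross_val_score(clf, features, y, cv=5)
print(f"Cross-validated accuracy: {scores.mean():.2f} +/- {scores.std():.2f}")
```

A learned feature extractor, such as a small 1-D neural network applied directly to the raw windows, could replace the hand-crafted features above; the paper reports comparing several such machine learning techniques.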
Pages: 13