Speech-driven mobile games for speech therapy: User experiences and feasibility

Cited by: 44
Authors
Ahmed, Beena [1, 2]
Monroe, Penelope [3]
Hair, Adam [4]
Tan, Chek Tien [5]
Gutierrez-Osuna, Ricardo [4]
Ballard, Kirrie J. [3]
Affiliations
[1] Univ New South Wales, Sch Elect Engn & Telecommun, Sydney, NSW 2052, Australia
[2] Texas A&M Univ Qatar, Sch Elect Engn & Telecommun, Doha, Qatar
[3] Univ Sydney, Fac Hlth Sci, Sydney, NSW, Australia
[4] Texas A&M Univ, Dept Comp Sci & Engn, Coll Engn, College Stn, TX 77843 USA
[5] Univ Technol, Games Studio Dept, Sydney, NSW, Australia
Funding
Australian Research Council
Keywords
speech-controlled games; mobile therapy apps; ASR applications; ASR in games; childhood apraxia of speech; CHILDREN; WORDS;
DOI
10.1080/17549507.2018.1513562
Chinese Library Classification (CLC) number
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline classification code
100104; 100213
Abstract
Purpose: To assist in remote treatment, speech-language pathologists (SLPs) rely on mobile games, which, though entertaining, lack feedback mechanisms. Games integrated with automatic speech recognition (ASR) offer a solution in which speech productions control gameplay. We therefore performed a feasibility study to assess children's and SLPs' experiences with speech-controlled games, their game feature preferences and ASR accuracy. Method: Ten children with childhood apraxia of speech (CAS), six typically developing (TD) children and seven SLPs trialled five games and answered questionnaires. Researchers also compared the results of ASR to perceptual judgment. Result: Children and SLPs found speech-controlled games interesting and fun, despite ASR-human disagreements. They preferred games with rewards, challenge and multiple difficulty levels. ASR-human agreement was higher for SLPs than children, similar between TD and CAS and unaffected by CAS severity (incorrect productions: 77% TD, 75% CAS; correct productions: 51% TD, 47% CAS, 71% SLP). Manually stopping the recording yielded higher agreement than automatic stopping. Word length did not influence agreement. Conclusion: Children's and SLPs' positive responses towards speech-controlled games suggest that they can engage children in higher intensity practice. Our findings can guide future improvements to the ASR, recording methods and game features to improve the user experience and therapy adherence.
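The agreement figures in the abstract are reported separately for correct and incorrect productions. As a rough illustration only, the sketch below shows one way such a percentage could be computed, assuming each production carries a binary human (perceptual) judgment and a binary ASR decision. The function name `agreement_by_human_label`, the data layout and the sample values are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def agreement_by_human_label(pairs):
    """Percent ASR-human agreement, split by the human judgment.

    `pairs` is an iterable of (human_correct, asr_correct) booleans,
    one tuple per spoken production (assumed layout, not the paper's).
    """
    totals = defaultdict(int)   # productions per human label
    agreed = defaultdict(int)   # productions where ASR matched the human
    for human_correct, asr_correct in pairs:
        label = "correct" if human_correct else "incorrect"
        totals[label] += 1
        if human_correct == asr_correct:
            agreed[label] += 1
    return {label: 100.0 * agreed[label] / totals[label] for label in totals}


# Made-up sample: four productions judged correct, two judged incorrect.
sample = [
    (True, True), (True, True), (True, True), (True, False),
    (False, False), (False, True),
]
result = agreement_by_human_label(sample)
print(result)
```

Splitting by the human label keeps a skewed mix of correct and incorrect productions from hiding poor agreement on one of the two classes.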
Pages: 644-658
Number of pages: 15