Constraining user response via multimodal dialog interface

Cited: 3
Authors
Baker K. [1 ]
Mckenzie A. [2 ]
Biermann A. [2 ]
Webelhuth G. [3 ]
Affiliations
[1] Linguistics Department, Ohio State University, 222 Oxley Hall, Columbus, OH 43210-1298
[2] Department of Computer Science, Duke University, Durham, NC 27708-0129
[3] Sem. für Englische Philologie, Georg-August-Univ. Göttingen, 37073 Göttingen
Keywords
Constrain user response; Multimodal dialog interface; Speech recognition
DOI
10.1023/B:IJST.0000037069.82313.57
Abstract
This paper presents the results of an experiment comparing two different designs of an automated dialog interface. We compare a multimodal design utilizing text displays coordinated with spoken prompts to a voice-only version of the same application. Our results show that the text-coordinated version is more efficient in terms of word recognition and number of out-of-grammar responses, and is equal to the voice-only version in terms of user satisfaction. We argue that this type of multimodal dialog interface effectively constrains user response to allow for better speech recognition without increasing cognitive load or compromising the naturalness of the interaction.
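The paper itself gives no implementation details, but the central idea the abstract describes, accepting only the options currently shown on screen as in-grammar responses for a dialog turn, can be sketched roughly as follows. This is an illustrative Python sketch and not the authors' system: the ConstrainedPrompt class, the fuzzy matching via difflib, and the 0.8 acceptance threshold are all assumptions introduced here for clarity.

```python
from difflib import SequenceMatcher


class ConstrainedPrompt:
    """Pairs a spoken prompt with the options shown on screen; only those
    options count as in-grammar responses for the current dialog turn.
    (Hypothetical class, not from the paper.)"""

    def __init__(self, spoken_prompt, displayed_options):
        self.spoken_prompt = spoken_prompt
        self.displayed_options = [opt.lower() for opt in displayed_options]

    def match(self, hypothesis, threshold=0.8):
        """Return the displayed option that best matches the recognizer's
        hypothesis, or None if the response falls outside the constrained
        grammar (an out-of-grammar response)."""
        hypothesis = hypothesis.lower().strip()
        best_option, best_score = None, 0.0
        for option in self.displayed_options:
            score = SequenceMatcher(None, hypothesis, option).ratio()
            if score > best_score:
                best_option, best_score = option, score
        return best_option if best_score >= threshold else None


if __name__ == "__main__":
    turn = ConstrainedPrompt(
        spoken_prompt="Which department would you like?",
        displayed_options=["billing", "technical support", "sales"],
    )
    print(turn.match("billing"))         # -> "billing" (in-grammar)
    print(turn.match("the second one"))  # -> None (out-of-grammar)
```

In a real deployment the constraint would more likely be pushed into the recognizer's active grammar or language model rather than applied to its output as above, but the turn-level coupling between what is displayed and what is accepted is the point the abstract makes.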
Pages: 251-258
Page count: 7