Constraining user response via multimodal dialog interface

Cited by: 3
Authors
Baker K. [1 ]
Mckenzie A. [2 ]
Biermann A. [2 ]
Webelhuth G. [3 ]
Affiliations
[1] Linguistics Department, Ohio State University, 222 Oxley Hall, Columbus, OH 43210-1298
[2] Department of Computer Science, Duke University, Durham, NC 27708-0129
[3] Sem. für Englische Philologie, Georg-August-Univ. Göttingen, 37073 Göttingen
Keywords
Constrain user response; Multimodal dialog interface; Speech recognition
DOI
10.1023/B:IJST.0000037069.82313.57
Abstract
This paper presents the results of an experiment comparing two different designs of an automated dialog interface. We compare a multimodal design utilizing text displays coordinated with spoken prompts to a voice-only version of the same application. Our results show that the text-coordinated version is more efficient in terms of word recognition and number of out-of-grammar responses, and is equal to the voice-only version in terms of user satisfaction. We argue that this type of multimodal dialog interface effectively constrains user response to allow for better speech recognition without increasing cognitive load or compromising the naturalness of the interaction.
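The design the abstract describes (spoken prompts coordinated with an on-screen list of the legal responses, so the recognizer only has to cover a small grammar) can be illustrated with a short sketch. The following Python is a hypothetical illustration, not the authors' implementation; display, speak, and recognize are placeholder stand-ins for a GUI, a TTS engine, and a grammar-constrained speech recognizer.

```python
# Hypothetical sketch of a text-coordinated dialog turn (not the paper's code).
# The idea: show the in-grammar answers on screen while speaking the prompt,
# and constrain the recognizer to exactly those answers.

def dialog_turn(prompt: str, options: list[str]) -> str:
    """Speak a prompt, display the allowed answers, and re-prompt
    until the recognized response is in-grammar."""
    display(prompt + "\n  " + "\n  ".join(options))   # coordinated text display
    speak(prompt)                                     # spoken prompt
    while True:
        hypothesis = recognize(grammar=options)       # ASR limited to the options
        if hypothesis in options:                     # in-grammar response
            return hypothesis
        speak("Please choose one of the displayed options.")

# Stand-in implementations so the sketch runs as a console demo.
def display(text: str) -> None:
    print(text)

def speak(text: str) -> None:
    print("[TTS] " + text)

def recognize(grammar: list[str]) -> str:
    # Placeholder for a grammar-constrained recognizer; reads typed input here.
    return input("> ").strip().lower()

if __name__ == "__main__":
    account = dialog_turn("Which account would you like?", ["checking", "savings"])
    print("Selected:", account)
```

In the voice-only condition the abstract compares against, the on-screen display would simply be omitted, leaving users to infer the expected wording from the spoken prompt alone.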
Pages: 251-258
Page count: 7
Related papers
50 records in total
[31] Multimodal Gesture Recognition via Multiple Hypotheses Rescoring [J]. Pitsikalis, Vassilis; Katsamanis, Athanasios; Theodorakis, Stavros; Maragos, Petros. JOURNAL OF MACHINE LEARNING RESEARCH, 2015, 16: 255-284
[32] Estimating the user's state before exchanging utterances using intermediate acoustic features for spoken dialog systems [J]. Chiba, Yuya; Nose, Takashi; Ito, Masashi; Ito, Akinori. IAENG International Journal of Computer Science, 2016, 43(01): 1-9
[33] From a Wizard of Oz experiment to a real time speech and gesture multimodal interface [J]. Carbini, S.; Delphin-Poulat, L.; Perron, L.; Viallet, J. E. SIGNAL PROCESSING, 2006, 86(12): 3559-3577
[34] "Hey CAI" - Conversational AI Enabled User Interface for HPC Tools [J]. Kousha, Pouya; Jain, Arpan; Kolli, Ayyappa; Sainath, Prasanna; Subramoni, Hari; Shafi, Aamir; Panda, Dhableswar K. HIGH PERFORMANCE COMPUTING, ISC HIGH PERFORMANCE 2022, 2022, 13289: 87-108
[35] Integrated multimodal human-computer interface and augmented reality for interactive display applications [J]. Vassiliou, M. S.; Sundareswaran, V.; Chen, S.; Behringer, R.; Tam, C.; Chan, M.; Bangayan, P.; McGee, J. COCKPIT DISPLAYS VII: DISPLAYS FOR DEFENSE APPLICATIONS, 2000, 4022: 106-115
[36] Multimodal Interface Based on Novel HMI UI/UX for In-Vehicle Infotainment System [J]. Kim, Jinwoo; Ryu, Jae Hong; Han, Tae Man. ETRI JOURNAL, 2015, 37(04): 793-803
[37] A Cognitive User Interface for a Multi-modal Human-Machine Interaction [J]. Tschoepe, Constanze; Duckhorn, Frank; Huber, Markus; Meyer, Werner; Wolff, Matthias. SPEECH AND COMPUTER (SPECOM 2018), 2018, 11096: 707-717
[38] Unvoiced: Designing an LLM-assisted Unvoiced User Interface using Earables [J]. Srivastava, Tanmay; Khanna, Prerna; Pan, Shijia; Phuc Nguyen; Jain, Shubham. PROCEEDINGS OF THE 2024 ACM CONFERENCE ON EMBEDDED NETWORKED SENSOR SYSTEMS, SENSYS 2024, 2024: 784-798
[39] An Integrated Model of Voice-User Interface Continuance Intention: The Gender Effect [J]. Nguyen, Quynh N.; Anh Ta; Prybutok, Victor. INTERNATIONAL JOURNAL OF HUMAN-COMPUTER INTERACTION, 2019, 35(15): 1362-1377
[40] A user interface-level integration method for multiple automatic speech translation systems [J]. Osada, Seiya; Yamabana, Kiyoshi; Hanazawa, Ken; Okumura, Akitoshi. PACLIC 20: PROCEEDINGS OF THE 20TH PACIFIC ASIA CONFERENCE ON LANGUAGE, INFORMATION AND COMPUTATION, 2006: 72-79