An Assistive Bi-modal User Interface Integrating Multi-channel Speech Recognition and Computer Vision

Times cited: 0
Authors
Karpov, Alexey [1]
Ronzhin, Andrey [1]
Kipyatkova, Irina [1]
Affiliation
[1] Russian Acad Sci, St Petersburg Inst Informat & Automat, SPIIRAS, St Petersburg 199178, Russia
Source
HUMAN-COMPUTER INTERACTION: INTERACTION TECHNIQUES AND ENVIRONMENTS, PT II | 2011 / Vol. 6762
Keywords
Multi-modal user interface; assistive technology; speech recognition; computer vision; cognitive experiments
DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
In this paper, we present a bi-modal user interface intended both to assist persons without hands or with physical disabilities of the hands or arms and to provide contactless HCI for able-bodied users. A user can manipulate a virtual mouse pointer by moving his/her head and can communicate verbally with a computer, giving speech commands instead of using computer input devices. Speech is a very useful modality for referencing objects and actions on objects, whereas head pointing gestures/motion are a powerful modality for indicating spatial locations. The bi-modal interface integrates a tri-lingual system for multi-channel audio signal processing and automatic recognition of voice commands in English, French and Russian with a vision-based head detection/tracking system. It processes natural speech and head pointing movements in parallel and fuses both information streams into a single multimodal command, in which each modality carries its own semantic information: head position indicates the 2D head/pointer coordinates, while the speech signal yields control commands. The bi-modal user interface was tested and compared with contact-based pointing interfaces following the methodology of ISO 9241-9.
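The fusion step described in the abstract, where head position supplies 2D pointer coordinates and speech supplies the control command, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the two streams are synchronized, so pairing each command with the head sample nearest in time is an assumption made here, and all names (HeadSample, SpeechCommand, MultimodalCommand, fuse) are hypothetical.

```python
from bisect import bisect_left
from dataclasses import dataclass
from typing import List


@dataclass
class HeadSample:
    t: float  # timestamp in seconds
    x: int    # pointer x coordinate in pixels
    y: int    # pointer y coordinate in pixels


@dataclass
class SpeechCommand:
    t: float   # time the command is taken to refer to (assumed known)
    name: str  # recognized command label, e.g. "left_click"


@dataclass
class MultimodalCommand:
    name: str
    x: int
    y: int


def fuse(samples: List[HeadSample], cmd: SpeechCommand) -> MultimodalCommand:
    """Pair a speech command with the head sample closest to it in time.

    Assumes `samples` is sorted by timestamp. Nearest-timestamp pairing
    is an illustrative assumption, not a rule stated in the abstract.
    """
    if not samples:
        raise ValueError("no head-tracking samples available")
    times = [s.t for s in samples]
    i = bisect_left(times, cmd.t)
    # Only the neighbours on either side of the insertion point can be nearest.
    candidates = samples[max(i - 1, 0): i + 1]
    nearest = min(candidates, key=lambda s: abs(s.t - cmd.t))
    return MultimodalCommand(cmd.name, nearest.x, nearest.y)


if __name__ == "__main__":
    track = [HeadSample(0.0, 100, 100),
             HeadSample(0.1, 120, 110),
             HeadSample(0.2, 140, 120)]
    print(fuse(track, SpeechCommand(0.12, "left_click")))
```

Keeping the two streams as separate timestamped records until the moment of fusion mirrors the parallel processing the abstract describes; a real system might instead pair a command with the pointer position at the onset of the utterance, which this sketch does not model.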
Pages: 454-463
Number of pages: 10