Integrated multimodal human-computer interface and augmented reality for interactive display applications

Cited: 1
Authors
Vassiliou, MS [1 ]
Sundareswaran, V [1 ]
Chen, S [1 ]
Behringer, R [1 ]
Tam, C [1 ]
Chan, M [1 ]
Bangayan, P [1 ]
McGee, J [1 ]
Affiliation
[1] Rockwell Int Sci Ctr, Thousand Oaks, CA 91360 USA
Source
COCKPIT DISPLAYS VII: DISPLAYS FOR DEFENSE APPLICATIONS | 2000, Vol. 4022
Keywords
human-computer interface; speech recognition; 3D Audio; eyetracking; multimodal integration; lip reading; multimedia; augmented reality; tactical operation center; wearable computing;
DOI
10.1117/12.397779
CLC Number
V [Aviation, Aerospace];
Discipline Classification Code
08; 0825
Abstract
We describe new systems for improved integrated multimodal human-computer interaction and augmented reality for a diverse array of applications, including future advanced cockpits, tactical operations centers, and others. We have developed an integrated display system featuring: speech recognition of multiple concurrent users equipped with both standard air-coupled microphones and novel throat-coupled sensors (developed at Army Research Labs for increased noise immunity); lip reading for improving speech recognition accuracy in noisy environments; three-dimensional spatialized audio for improved display of warnings, alerts, and other information; wireless, coordinated handheld-PC control of a large display; real-time display of data and inferences from wireless integrated networked sensors with on-board signal processing and discrimination; gesture control with disambiguated point-and-speak capability; head- and eye-tracking coupled with speech recognition for "look-and-speak" interaction; and integrated tetherless augmented reality on a wearable computer. The various interaction modalities (speech recognition, 3D audio, eyetracking, etc.) are implemented as "modality servers" in an Internet-based client-server architecture. Each modality server encapsulates and exposes commercial and research software packages, presenting a socket network interface that is abstracted to a high-level interface, minimizing both vendor dependencies and required changes on the client side as the server's technology improves.
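The "modality server" pattern in the abstract lends itself to a brief illustration. Below is a minimal sketch, assuming a line-oriented text protocol over TCP; the port number, the QUERY/PING message names, and the EchoRecognizer placeholder engine are illustrative assumptions, not details from the paper.

```python
# Minimal sketch of one "modality server": a socket front end that
# encapsulates a recognition engine behind a high-level text protocol.
# The protocol, port, and EchoRecognizer are hypothetical stand-ins.
import socketserver


class EchoRecognizer:
    """Placeholder for a wrapped commercial/research recognition engine."""

    def recognize(self, payload: str) -> str:
        # A real server would forward `payload` to the underlying engine.
        return f"RESULT {payload.upper()}"


ENGINE = EchoRecognizer()


class ModalityHandler(socketserver.StreamRequestHandler):
    def handle(self):
        # One request per line; reply with one line per request.
        for raw in self.rfile:
            line = raw.decode("utf-8").strip()
            if line.startswith("QUERY "):
                reply = ENGINE.recognize(line[len("QUERY "):])
            elif line == "PING":
                reply = "PONG"
            else:
                reply = "ERROR unknown-message"
            self.wfile.write((reply + "\n").encode("utf-8"))


if __name__ == "__main__":
    # ThreadingTCPServer lets several clients hold connections concurrently.
    with socketserver.ThreadingTCPServer(("0.0.0.0", 9000), ModalityHandler) as server:
        server.serve_forever()
```

Because clients speak only the high-level line protocol, the encapsulated engine can be swapped for a different commercial or research package without client-side changes, which is the decoupling the abstract claims for its architecture.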
Pages: 106-115
Page count: 10
Related Papers (8)
[1] Behringer R., 1999, Augmented Reality, p. 225.
[2] Blauert J., 1997, Spatial Hearing: The Psychophysics of Human Sound Localization.
[3] Chan M.T., 1998, Proc. IEEE Signal Processing Society, p. 3733.
[4] Jacob R.J., 1993, Advances in Human-Computer Interaction, Vol. 4, p. 151.
[5] Marcy H.O., 1999, Proc. 1999 AIAA Meeting.
[6] Rabiner L., 1993, Fundamentals of Speech Recognition.
[7] Rudmann D.S., 1999, Proc. 3rd Annual ARL Federated Laboratory Symposium, p. 91.
[8] Wenzel E.M., Arruda M., Kistler D.J., Wightman F.L., "Localization using nonindividualized head-related transfer functions," Journal of the Acoustical Society of America, 1993, 94(1), 111-123.