How productivity improves in hands-free continuous dictation tasks: lessons learned from a longitudinal study

Cited by: 10
Authors
Feng, JJ [1]
Karat, CM [1]
Sears, A [1]
Affiliations
[1] IBM Corp, TJ Watson Res Ctr, Hawthorne, NY 10532 USA
Funding
National Science Foundation (USA);
Keywords
automatic speech recognition technologies; error correction; speech recognition software;
DOI
10.1016/j.intcom.2004.06.013
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Speech recognition technology continues to improve, but users still experience significant difficulty using the software to create and edit documents. The reported composition speed using speech software is only between 8 and 15 words per minute [Proc CHI 99 (1999) 568; Universal Access Inform Soc 1 (2001) 4], much lower than people's normal speaking speed of 125-150 words per minute. What causes the huge gap between natural speaking and composing using speech recognition? Is it possible to narrow the gap and make speech recognition more promising to users? In this paper we discuss users' learning processes and the difficulties they experience in continuous dictation tasks using state-of-the-art Automatic Speech Recognition (ASR) software. For the first time, detailed data were collected on various aspects of the three activities involved in document composition tasks: dictation, navigation, and correction. The results indicate that navigation and error correction accounted for a large share of the dictation task during the early stages of interaction. As users gained more experience, they became more efficient at dictation, navigation, and error correction. However, the major improvements in productivity were due to dictation quality and the usage of navigation commands. These results provide insights into the factors that cause the gap between user expectations of speech recognition software and the reality of use, and into how those factors changed with experience. Specific advice is given to researchers as to the most critical issues that must be addressed. (c) 2004 Elsevier B.V. All rights reserved.
Pages: 265-289
Page count: 25
Related papers
28 in total
[1]   OPTIMIZATION OF STRING LENGTH FOR SPOKEN DIGIT INPUT WITH ERROR CORRECTION [J].
AINSWORTH, WA .
INTERNATIONAL JOURNAL OF MAN-MACHINE STUDIES, 1988, 28 (06) :573-581
[2]   FEEDBACK-STRATEGIES FOR ERROR CORRECTION IN SPEECH RECOGNITION SYSTEMS [J].
AINSWORTH, WA ;
PRATT, SR .
INTERNATIONAL JOURNAL OF MAN-MACHINE STUDIES, 1992, 36 (06) :833-842
[3]   MODELING ERROR RECOVERY AND REPAIR IN AUTOMATIC SPEECH RECOGNITION [J].
BABER, C ;
HONE, KS .
INTERNATIONAL JOURNAL OF MAN-MACHINE STUDIES, 1993, 39 (03) :495-515
[4]  
BURMEISTER M, 1997, CHI 97, P36
[5]   THE ROLE OF VOICE INPUT FOR HUMAN-MACHINE COMMUNICATION [J].
COHEN, PR ;
OVIATT, SL .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1995, 92 (22) :9921-9927
[6]  
DEMAURO C, 2001, EASY ACCESS GRAPHICA
[7]  
FENG J, 2002, UMBC TECHNICAL REPOR
[8]  
Halverson CA, 1999, HUMAN-COMPUTER INTERACTION - INTERACT '99, P133
[9]  
HAUPTMANN AG, 1989, P C HUM FACT COMP SY, P241
[10]  
Karat C., 1999, P SIGCHI C HUMAN FAC, P568, DOI 10.1145/302979.303160