How Good is 85%? A Survey Tool to Connect Classifier Evaluation to Acceptability of Accuracy

Cited by: 39
Authors
Kay, Matthew [1 ]
Patel, Shwetak N. [1 ]
Kientz, Julie A. [2 ]
Affiliations
[1] Univ Washington, Comp Sci & Engn Dub, Seattle, WA 98195 USA
[2] Univ Washington, Human Ctr Design & Engn Dub, Seattle, WA 98195 USA
Source
CHI 2015: PROCEEDINGS OF THE 33RD ANNUAL CHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS | 2015
Keywords
Classifiers; accuracy; accuracy acceptability; inference; machine learning; sensors; INFORMATION; ACCEPTANCE;
DOI
10.1145/2702123.2702603
Chinese Library Classification (CLC)
TP3 [Computing technology; computer technology]
Discipline code
0812
Abstract
Many HCI and ubiquitous computing systems are characterized by two important properties: their output is uncertain (it has an associated accuracy that researchers attempt to optimize) and this uncertainty is user-facing (it directly affects the quality of the user experience). Novel classifiers are typically evaluated using measures like the F1 score, but given an F-score of (e.g.) 0.85, how do we know whether this performance is good enough? Is this level of uncertainty actually tolerable to users of the intended application, and do people weight precision and recall equally? We set out to develop a survey instrument that can systematically answer such questions. We introduce a new measure, acceptability of accuracy, and show how to predict it based on measures of classifier accuracy. Our tool allows us to systematically select an objective function to optimize during classifier evaluation, but can also offer new insights into how to design feedback for user-facing classification systems (e.g., by combining a seemingly low-performing classifier with appropriate feedback to make a highly usable system). It also reveals potential issues with the ubiquitous F1 measure as applied to user-facing systems.
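The abstract's question about whether people weight precision and recall equally can be made concrete with the F-beta family of scores, of which the F1 score is the balanced special case. As an illustrative sketch (not code from the paper), the following shows how the same precision/recall pair yields different scores once the weighting changes:

```python
def fbeta(precision: float, recall: float, beta: float = 1.0) -> float:
    """F-beta score: beta > 1 weights recall more heavily, beta < 1 weights precision."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# A hypothetical classifier with precision 0.9 and recall 0.8:
f1 = fbeta(0.9, 0.8)                # balanced F1, ~0.847
f2 = fbeta(0.9, 0.8, beta=2.0)      # recall-weighted, ~0.818
f_half = fbeta(0.9, 0.8, beta=0.5)  # precision-weighted, ~0.878
```

If users of an application tolerate false alarms less than missed detections (or vice versa), a beta chosen to reflect that asymmetry gives a more faithful objective than the default F1.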
Pages: 347-356
Page count: 10