Local Decision Pitfalls in Interactive Machine Learning: An Investigation into Feature Selection in Sentiment Analysis

Cited by: 21
Authors
Wu, Tongshuang [1]
Weld, Daniel [1]
Heer, Jeffrey [1]
Affiliations
[1] Univ Washington, 185 E Stevens Way NE, Seattle, WA 98195 USA
Keywords
Machine learning; text classification; performance analysis
DOI
10.1145/3319616
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Tools for Interactive Machine Learning (IML) enable end users to update models in a "rapid, focused, and incremental" (yet local) manner. In this work, we study the question of local decision making in an IML context around feature selection for a sentiment classification task. Specifically, we characterize the utility of interactive feature selection through a combination of human-subjects experiments and computational simulations. We find that, in expectation, interactive modification fails to improve model performance and may hamper generalization due to overfitting. We examine how these trends are affected by the dataset, the learning algorithm, and the training set size. Across these factors, we observe consistent generalization issues. Our results suggest that rapid iterations with IML systems can be dangerous if they encourage local actions divorced from global context, degrading overall model performance. We conclude by discussing the implications of our feature selection results for the broader area of IML systems and research.
Pages: 1-27
Page count: 27