Local Decision Pitfalls in Interactive Machine Learning: An Investigation into Feature Selection in Sentiment Analysis

Cited by: 21
Authors
Wu, Tongshuang [1]
Weld, Daniel [1]
Heer, Jeffrey [1]
Affiliation
[1] Univ Washington, 185 E Stevens Way NE, Seattle, WA 98195 USA
Keywords
Machine learning; text classification; performance analysis
DOI
10.1145/3319616
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Tools for Interactive Machine Learning (IML) enable end users to update models in a "rapid, focused, and incremental" (yet local) manner. In this work, we study the question of local decision making in an IML context around feature selection for a sentiment classification task. Specifically, we characterize the utility of interactive feature selection through a combination of human-subjects experiments and computational simulations. We find that, in expectation, interactive modification fails to improve model performance and may hamper generalization due to overfitting. We examine how these trends are affected by the dataset, learning algorithm, and the training set size. Across these factors we observe consistent generalization issues. Our results suggest that rapid iterations with IML systems can be dangerous if they encourage local actions divorced from global context, degrading overall model performance. We conclude by discussing the implications of our feature selection results for the broader area of IML systems and research.
Pages: 1 / 27
Number of pages: 27