AnchorViz: Facilitating Semantic Data Exploration and Concept Discovery for Interactive Machine Learning

Cited by: 16
Authors
Suh, Jina [1]
Ghorashi, Soroush [1]
Ramos, Gonzalo [1]
Chen, Nan-Chen [2]
Drucker, Steven [1]
Verwey, Johan [1]
Simard, Patrice [1]
Affiliations
[1] Microsoft Res, 1 Microsoft Way, Redmond, WA 98052 USA
[2] Univ Washington, Seattle, WA 98195 USA
Keywords
Interactive machine learning; visualization; error discovery; semantic data exploration; unlabeled data; concept discovery; machine teaching
DOI
10.1145/3241379
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
When building a classifier in interactive machine learning (iML), human knowledge about the target class can be a powerful reference to make the classifier robust to unseen items. The main challenge lies in finding unlabeled items that can either help discover or refine concepts for which the current classifier has no corresponding features (i.e., it has feature blindness). Yet it is unrealistic to ask humans to come up with an exhaustive list of items, especially for rare concepts that are hard to recall. This article presents AnchorViz, an interactive visualization that facilitates the discovery of prediction errors and previously unseen concepts through human-driven semantic data exploration. By creating example-based or dictionary-based anchors representing concepts, users create a topology that (a) spreads data based on their similarity to the concepts and (b) surfaces the prediction and label inconsistencies between data points that are semantically related. Once such inconsistencies and errors are discovered, users can encode the new information as labels or features and interact with the retrained classifier to validate their actions in an iterative loop. We evaluated AnchorViz through two user studies. Our results show that AnchorViz helps users discover more prediction errors than stratified random and uncertainty sampling methods. Furthermore, during the beginning stages of a training task, an iML tool with AnchorViz can help users build classifiers comparable to the ones built with the same tool with uncertainty sampling and keyword search, but with fewer labels and more generalizable features. We discuss exploration strategies observed during the two studies and how AnchorViz supports discovering, labeling, and refining of concepts through a sensemaking loop.
Pages: 38