DendroMap: Visual Exploration of Large-Scale Image Datasets for Machine Learning with Treemaps

Cited by: 14
Authors
Bertucci D. [1]
Hamid M.M. [1]
Anand Y. [1]
Ruangrotsakun A. [1]
Tabatabai D. [1]
Perez M. [1]
Kahng M. [1]
Affiliations
[1] Oregon State University, United States
Keywords
data-centric AI; error analysis; image data; treemaps; visual analytics; visualization for machine learning
DOI
10.1109/TVCG.2022.3209425
Abstract
In this paper, we present DendroMap, a novel approach to interactively exploring large-scale image datasets for machine learning (ML). ML practitioners often explore image datasets by generating a grid of images or projecting high-dimensional representations of images into 2-D using dimensionality reduction techniques (e.g., t-SNE). However, neither approach effectively scales to large datasets because images are ineffectively organized and interactions are insufficiently supported. To address these challenges, we develop DendroMap by adapting Treemaps, a well-known visualization technique. DendroMap effectively organizes images by extracting hierarchical cluster structures from high-dimensional representations of images. It enables users to make sense of the overall distributions of datasets and interactively zoom into specific areas of interest at multiple levels of abstraction. Our case studies with widely used image datasets for deep learning demonstrate that users can discover insights about datasets and trained models by examining the diversity of images, identifying underperforming subgroups, and analyzing classification errors. We conducted a user study that evaluates the effectiveness of DendroMap in grouping and searching tasks by comparing it with a gridified version of t-SNE and found that participants preferred DendroMap. © 2022 IEEE.
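The general idea described in the abstract (build a cluster hierarchy from high-dimensional vectors, then lay it out as a treemap) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name a clustering algorithm or layout method, so the average-linkage clustering, the slice-and-dice layout, and the names `agglomerate` and `treemap` are all assumptions chosen for illustration, and the toy 2-D points stand in for image representations.

```python
import math


def agglomerate(points):
    """Naive average-linkage agglomerative clustering (an assumed choice).

    Returns a binary tree of dicts: 'items' holds point indices,
    'children' holds the two merged subtrees (empty for a leaf).
    """
    clusters = [{"items": [i], "children": []} for i in range(len(points))]

    def dist(a, b):
        # Mean pairwise Euclidean distance between two clusters.
        total = sum(math.dist(points[i], points[j])
                    for i in a["items"] for j in b["items"])
        return total / (len(a["items"]) * len(b["items"]))

    while len(clusters) > 1:
        # Find and merge the closest pair of clusters.
        i, j = min(
            ((i, j) for i in range(len(clusters))
             for j in range(i + 1, len(clusters))),
            key=lambda p: dist(clusters[p[0]], clusters[p[1]]),
        )
        a, b = clusters[i], clusters[j]
        merged = {"items": a["items"] + b["items"], "children": [a, b]}
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0]


def treemap(node, x, y, w, h, depth=0, out=None):
    """Slice-and-dice layout (an assumed choice).

    Splits the rectangle among children in proportion to their leaf
    counts, alternating the split direction at each depth. Returns a
    list of (point_index, x, y, width, height) tuples, one per leaf.
    """
    if out is None:
        out = []
    if not node["children"]:
        out.append((node["items"][0], x, y, w, h))
        return out
    total = len(node["items"])
    offset = 0.0
    for child in node["children"]:
        frac = len(child["items"]) / total
        if depth % 2 == 0:
            treemap(child, x + offset * w, y, w * frac, h, depth + 1, out)
        else:
            treemap(child, x, y + offset * h, w, h * frac, depth + 1, out)
        offset += frac
    return out


# Toy stand-ins for image feature vectors: two well-separated pairs.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
tree = agglomerate(points)
rects = treemap(tree, 0.0, 0.0, 1.0, 1.0)
for rect in rects:
    print(rect)
```

In this sketch, nearby points end up in adjacent rectangles, and each subtree occupies an area proportional to the number of items it contains, which is the property that lets a viewer zoom into a cluster at a chosen level of the hierarchy.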
Pages: 320-330
Page count: 10