Investigating User Estimation of Missing Data in Visual Analysis

Cited by: 0
Authors
Sun, Maoyuan [1]
Wang, Yuanxin [2]
Bolton, Courtney [1]
Ma, Yue [1]
Li, Tianyi [3]
Zhao, Jian [2]
Affiliations
[1] Northern Illinois Univ, De Kalb, IL 60115 USA
[2] Univ Waterloo, Waterloo, ON, Canada
[3] Purdue Univ, W Lafayette, IN USA
Source
PROCEEDINGS OF THE 50TH GRAPHICS INTERFACE CONFERENCE, GI 2024 | 2024
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
Missing data; time series; visual analysis; UNCERTAINTY; IMPUTATION; VISUALIZATION; KNOWLEDGE; MODEL;
DOI
10.1145/3670947.3670977
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Missing data is a pervasive issue in real-world analytics, stemming from a multitude of factors (e.g., device malfunctions and network disruptions), making it a ubiquitous challenge in many domains. Misperception of missing data impacts decision-making and causes severe consequences. To mitigate risks from missing data and facilitate proper handling, computing methods (e.g., imputation) have been studied, which often culminate in the visual representation of data for analysts to further check. Yet, the influence of these computed representations on user judgment regarding missing data remains unclear. To study potential influencing factors and their impact on user judgment, we conducted a crowdsourcing study. We controlled 4 factors: the distribution, imputation, and visualization of missing data, and the prior knowledge of data. We compared users' estimations of missing data with computed imputations under different combinations of these factors. Our results offer useful guidance for visualizing missing data and their imputations, which informs future studies on developing trustworthy computing methods for visual analysis of missing data.
Number of pages: 13