Investigating User Estimation of Missing Data in Visual Analysis

Cited by: 0

Authors:

Sun, Maoyuan [1]

Wang, Yuanxin [2]

Bolton, Courtney [1]

Ma, Yue [1]

Li, Tianyi [3]

Zhao, Jian [2]

Affiliations:

[1] Northern Illinois University, DeKalb, IL 60115, USA

[2] University of Waterloo, Waterloo, ON, Canada

[3] Purdue University, West Lafayette, IN, USA

Source:

Proceedings of the 50th Graphics Interface Conference, GI 2024 | 2024

Funding:

Natural Sciences and Engineering Research Council of Canada (NSERC)

Keywords:

Missing data; time series; visual analysis; uncertainty; imputation; visualization; knowledge; model

DOI:

10.1145/3670947.3670977

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Missing data is pervasive in real-world analytics. It stems from many causes (e.g., device malfunctions and network disruptions) and poses a challenge in many domains. Misperceiving missing data can impair decision-making and lead to severe consequences. To mitigate these risks and support proper handling, computing methods (e.g., imputation) have been studied; their output is often presented visually for analysts to check further. Yet how these computed representations influence user judgment about missing data remains unclear. To study potential influencing factors and their impact on user judgment, we conducted a crowdsourcing study. We controlled four factors: the distribution, imputation, and visualization of missing data, and prior knowledge of the data. We compared users' estimations of missing data with computed imputations under different combinations of these factors. Our results offer guidance for visualizing missing data and their imputations and inform future studies on developing trustworthy computing methods for visual analysis of missing data.

Pages: 13
