Exploring incomplete data using visualization techniques

Cited by: 71
Authors
Templ, Matthias [1, 2]
Alfons, Andreas [1, 3]
Filzmoser, Peter [1]
Affiliations
[1] Vienna Univ Technol, Dept Stat & Probabil Theory, A-1040 Vienna, Austria
[2] Stat Austria, Methods Unit, A-1110 Vienna, Austria
[3] Katholieke Univ Leuven, ORSTAT Res Ctr, Fac Business & Econ, B-3000 Louvain, Belgium
Keywords
Visualization; Missing values; Exploring incomplete data; R software; Imputation
DOI
10.1007/s11634-011-0102-y
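Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract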
Visualization of incomplete data makes it possible to explore the data and the structure of the missing values simultaneously. This is helpful for learning about the distribution of the incomplete information in the data, and for identifying possible structures of the missing values and their relation to the available information. The main goal of this contribution is to stress the importance of exploring missing values using visualization methods and to present a collection of such visualization techniques for incomplete data, all of which are implemented in the R package VIM. Because this functionality is provided within this widely used statistical environment, visualization of missing values, imputation, and data analysis can all be done from within R without the need for additional software.
Pages: 29-47
Number of pages: 19
Related papers
43 in total
[1]  
Acuna E, 2009, MEMBERS CASTLE GROUP
[2]   SLEEP IN MAMMALS - ECOLOGICAL AND CONSTITUTIONAL CORRELATES [J].
ALLISON, T ;
CICCHETTI, DV .
SCIENCE, 1976, 194 (4266) :732-734
[3]  
Anderson T.W., 1986, STAT ANAL DATA, V2nd, DOI 10.1007/978-94-009-4109-0
[4]  
[Anonymous], VIM VIS IMP MISS VAL
[5]  
[Anonymous], 2011, R: A Language and Environment for Statistical Computing
[6]  
[Anonymous], 2000, SURV METHODOL
[7]   AN ANALYSIS OF TRANSFORMATIONS [J].
BOX, GEP ;
COX, DR .
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 1964, 26 (02) :211-252
[8]  
Cook D, 2007, USE R, P1, DOI 10.1007/978-0-387-71762-3
[9]   MAXIMUM LIKELIHOOD FROM INCOMPLETE DATA VIA EM ALGORITHM [J].
DEMPSTER, AP ;
LAIRD, NM ;
RUBIN, DB .
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-METHODOLOGICAL, 1977, 39 (01) :1-38
[10]  
Eaton C, 2005, LECT NOTES COMPUT SC, V3585, P861, DOI 10.1007/11555261_68