Data variability in the imputation quality of missing data

Cited by: 1
Authors
Stochero, Elisandra Lucia Moro [1]
Lucio, Alessandro Dal'Col [2]
Jacobi, Luciane Flores [3]
Affiliations
[1] Prefeitura Municipal Santa Maria, Secretaria Educ, Rua Alameda Montevideo 313, Edificio Sobral Pinto 1, BR-97010004 Santa Maria, RS, Brazil
[2] Univ Fed Santa Maria, Dept Fitotecnia, Santa Maria, RS, Brazil
[3] Univ Fed Santa Maria, Dept Estat, Santa Maria, RS, Brazil
Keywords
missing data; data imputation; randomized block design; distribution-free multiple imputation; multiple imputation
DOI
10.4025/actasciagron.v46i1.66185
Chinese Library Classification number
S3 [Agronomy]
Discipline classification code
0901
Abstract
Imputation methods were developed to estimate missing data and thereby avoid the problems caused by the loss of this information. This study assesses whether data variability influences the results obtained after applying an imputation method. Incomplete databases were generated by removing different amounts of data from complete real databases of tomato experiments conducted in a randomized block design with three replications and 12 treatments. The evaluated variables were fruit weight per plant, number of fruits per plant, and average fruit length and width, forming eight balanced databases. The distribution-free multiple imputation method was then applied to generate complete databases from the imputed values. The amount of missing information influenced the accuracy measures for the data in this study. Imputation was inadequate when variability was high but more precise and accurate when variability was low. These results confirm the importance of assessing data variability before choosing to apply an imputation method.
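The workflow summarized in the abstract (start from a complete treatment-by-block table, remove cells, impute them, then compare the imputed values with the known ones) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the data are simulated, the function names are hypothetical, and the imputation step is a simple iterative additive fit (treatment mean + block mean - grand mean) used as a stand-in for the distribution-free multiple imputation method named in the abstract.

```python
import numpy as np

def make_incomplete(data, n_missing, rng):
    """Return a copy of `data` with `n_missing` randomly chosen cells set to NaN."""
    out = data.astype(float).copy()
    idx = rng.choice(out.size, size=n_missing, replace=False)
    out.flat[idx] = np.nan
    return out

def impute_additive(incomplete, n_iter=50):
    """Stand-in imputation (NOT the paper's method): repeatedly replace each
    missing cell with treatment mean + block mean - grand mean."""
    mask = np.isnan(incomplete)
    filled = incomplete.copy()
    filled[mask] = np.nanmean(incomplete)  # starting guess: mean of observed cells
    for _ in range(n_iter):
        row = filled.mean(axis=1, keepdims=True)  # treatment means
        col = filled.mean(axis=0, keepdims=True)  # block means
        fit = row + col - filled.mean()
        filled[mask] = fit[mask]
    return filled

def rmse_on_missing(complete, incomplete, imputed):
    """Root mean squared error over the cells that were removed."""
    mask = np.isnan(incomplete)
    return float(np.sqrt(np.mean((complete[mask] - imputed[mask]) ** 2)))

def simulate(sd, n_missing=4, seed=0):
    """Simulate a 12-treatment x 3-block table with residual standard
    deviation `sd`, remove cells, impute, and return the RMSE."""
    rng = np.random.default_rng(seed)
    treat = np.linspace(50, 100, 12)[:, None]   # assumed treatment effects
    block = np.array([-2.0, 0.0, 2.0])[None, :]  # assumed block effects
    complete = treat + block + rng.normal(0.0, sd, size=(12, 3))
    incomplete = make_incomplete(complete, n_missing, rng)
    imputed = impute_additive(incomplete)
    return rmse_on_missing(complete, incomplete, imputed)

if __name__ == "__main__":
    # Compare imputation error under low and high residual variability.
    for sd in (1.0, 20.0):
        print(f"sd={sd}: RMSE on removed cells = {simulate(sd):.3f}")
```

Running `simulate` with different values of `sd` and `n_missing` gives a rough feel for how residual variability and the amount of missing data affect imputation error in this toy setting; it does not reproduce the paper's accuracy measures.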
Pages: 8