Design of an imputation methodology by random selection using regression trees

Times cited: 0
Authors
Useche, Lelly [1]
Perez Parra, Jean [2]
Garcia-Mendoza, Carlos [1]
Ides Chacon, Ana [1]
Affiliations
[1] UTM, Dept Matemat & Estadist, Inst Ciencias Basicas, Portoviejo, Ecuador
[2] UTM, Dept Quim, Inst Ciencias Basicas, Portoviejo, Ecuador
Source
BULLETIN OF COMPUTATIONAL APPLIED MATHEMATICS | 2021 / Vol. 9 / No. 2
Keywords
Absence of data; imputation; regression trees; random selection
DOI
Not available
Chinese Library Classification number
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
One of the biggest issues in the information collection stage is the absence of data. This research focuses on the scenario in which the loss is partial and completely random and the data are quantitative. Classic imputation techniques exist, but they have not been able to reproduce the real data accurately. An imputation methodology by random selection using regression trees is proposed, and it is compared theoretically and empirically with random selection without the tree for different percentages of data loss. Unbiased estimators of variances and biases are obtained by evaluating their properties, which improves the estimates. A disadvantage of the proposed design is that it does not solve the alteration of the distribution of the data or of the relationship between the variables.
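The abstract does not spell out the algorithm, so the following is only a minimal sketch of one plausible reading: a regression tree is grown on the complete cases, and each missing value is filled with an observed value drawn at random from the leaf its predictor falls into. Setting `use_tree=False` gives plain random selection from all observed values, the baseline the abstract compares against. Every name here (`build_tree`, `donors`, `impute`, `min_leaf`) is hypothetical, and the single-predictor tree is an assumption made to keep the example short, not the authors' design.

```python
import random
from statistics import mean


def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def build_tree(rows, min_leaf=3):
    """Grow a one-predictor regression tree on (x, y) pairs.

    Returns ("leaf", [y values]) or ("split", threshold, left, right).
    """
    rows = sorted(rows)
    ys = [y for _, y in rows]
    best = None
    for i in range(min_leaf, len(rows) - min_leaf + 1):
        if rows[i - 1][0] == rows[i][0]:
            continue  # cannot split between equal x values
        cost = sse(ys[:i]) + sse(ys[i:])
        if best is None or cost < best[0]:
            best = (cost, i)
    if best is None or best[0] >= sse(ys):
        return ("leaf", ys)
    i = best[1]
    threshold = (rows[i - 1][0] + rows[i][0]) / 2
    return ("split", threshold,
            build_tree(rows[:i], min_leaf),
            build_tree(rows[i:], min_leaf))


def donors(tree, x):
    """Return the observed y values in the leaf that x falls into."""
    while tree[0] == "split":
        tree = tree[2] if x <= tree[1] else tree[3]
    return tree[1]


def impute(xs, ys, rng, use_tree=True, min_leaf=3):
    """Fill each missing y (None) with a randomly selected donor value."""
    observed = [(x, y) for x, y in zip(xs, ys) if y is not None]
    if use_tree:
        tree = build_tree(observed, min_leaf)
    else:
        tree = ("leaf", [y for _, y in observed])  # one pool, no tree
    return [y if y is not None else rng.choice(donors(tree, x))
            for x, y in zip(xs, ys)]


# Toy data (invented for illustration): two clusters, one gap in each.
xs = [1, 2, 3, 4, 5, 6, 11, 12, 13, 14, 15, 16]
ys = [10, 11, None, 9, 10, 12, 100, 101, None, 99, 98, 102]
filled = impute(xs, ys, random.Random(0))
```

With the tree, the gap at `x = 3` can only receive a value from the low cluster and the gap at `x = 13` only one from the high cluster, whereas the treeless baseline may draw either gap from the whole pool.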
Pages: 97-121
Page count: 25