A Study of High-Dimensional Data Imputation Using Additive LASSO Regression Model

Cited by: 4
Authors
Lavanya, K. [1]
Reddy, L. S. S. [2]
Reddy, B. Eswara [3]
Affiliations
[1] JNTUA, Dept Comp Sci & Engn, Anantapur 515822, Andhra Pradesh, India
[2] KLU, Dept Comp Sci & Engn, Guntur 522502, Andhra Pradesh, India
[3] JNTUA, Dept Comp Sci, Anantapur 517234, Andhra Pradesh, India
Keywords
High-dimensional data; Multiple imputations; Regression; Missing data; MULTIPLE IMPUTATION; MISSING-DATA; METAANALYSIS; HETEROGENEITY;
DOI
10.1007/978-981-10-8055-5_3
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
With the rapid growth of computational domains such as bioinformatics, finance, engineering, biometrics, and neuroimaging, the need to analyze high-dimensional data has become pressing. Many real-world datasets contain hundreds or thousands of features. A common problem in most knowledge-based classification tasks is the quality and quantity of the data. In particular, high-dimensional data samples often contain missing or unknown attribute values, incomplete feature vectors, and uncertain or vague data, all of which have to be handled carefully. Because a large share of the values in such datasets is missing, refined multiple imputation (MI) methods are required to estimate the missing values so that a fair and more consistent analysis can be achieved. In this paper, three imputation methods, namely mean imputation, predictive mean imputation, and imputation by additive LASSO, are employed in a cloud environment. Results show that imputation by additive LASSO is the preferred MI method.
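To make the contrast between the simplest and the regression-based approach concrete, the following is a minimal sketch in plain Python. It is not the authors' additive LASSO procedure, which this record does not specify: it uses an ordinary linear LASSO fitted by coordinate descent, produces a single imputation rather than multiple imputations, and the function names (`mean_impute`, `lasso_fit`, `lasso_impute`) and the penalty value `lam` are illustrative assumptions.

```python
def mean_impute(column):
    """Replace None entries with the mean of the observed entries."""
    observed = [v for v in column if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in column]


def soft_threshold(z, lam):
    """Soft-thresholding operator used in the LASSO coordinate update."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_fit(X, y, lam, n_iter=200):
    """Coordinate descent for (1/2n)*||y - b0 - X b||^2 + lam*||b||_1.

    Features and target are centered, so the intercept is recovered
    from the means after the coefficients are fitted.
    """
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    beta = [0.0] * p
    resid = [v - y_mean for v in y]  # residual of the centered problem
    for _ in range(n_iter):
        for j in range(p):
            col = [Xc[i][j] for i in range(n)]
            norm = sum(c * c for c in col) / n
            if norm == 0.0:
                continue  # constant feature carries no information
            # Covariance of feature j with the partial residual.
            rho = sum(c * (r + c * beta[j]) for c, r in zip(col, resid)) / n
            new_b = soft_threshold(rho, lam) / norm
            delta = new_b - beta[j]
            resid = [r - c * delta for c, r in zip(col, resid)]
            beta[j] = new_b
    intercept = y_mean - sum(b * m for b, m in zip(beta, x_mean))
    return intercept, beta


def lasso_impute(X, y, lam=0.1):
    """Fill None entries of y with LASSO predictions from complete X."""
    obs = [i for i, v in enumerate(y) if v is not None]
    b0, beta = lasso_fit([X[i] for i in obs], [y[i] for i in obs], lam)
    return [
        b0 + sum(x * b for x, b in zip(X[i], beta)) if y[i] is None else y[i]
        for i in range(len(y))
    ]
```

Mean imputation ignores the other features entirely, whereas the LASSO variant predicts each missing value from the remaining features and shrinks uninformative coefficients to zero, which matters when the number of features is large. A multiple-imputation version would repeat the regression step with added random variation and pool the results; that step is omitted here.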
Pages: 19-30
Page count: 12
Related papers
50 items in total
  • [31] High-Dimensional Fused Lasso Regression Using Majorization-Minimization and Parallel Processing
    Yu, Donghyeon
    Won, Joong-Ho
    Lee, Taehoon
    Lim, Johan
    Yoon, Sungroh
    JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS, 2015, 24 (01) : 121 - 153
  • [32] Comparison of Lasso Type Estimators for High-Dimensional Data
    Kim, Jaehee
    COMMUNICATIONS FOR STATISTICAL APPLICATIONS AND METHODS, 2014, 21 (04) : 349 - 361
  • [33] Sparse and debiased lasso estimation and inference for high-dimensional composite quantile regression with distributed data
    Hou, Zhaohan
    Ma, Wei
    Wang, Lei
    TEST, 2023, 32 (04) : 1230 - 1250
  • [34] Lasso penalized semiparametric regression on high-dimensional recurrent event data via coordinate descent
    Wu, Tong Tong
    JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION, 2013, 83 (06) : 1145 - 1155
  • [35] Hi-LASSO: High-Dimensional LASSO
    Kim, Youngsoon
    Hao, Jie
    Mallavarapu, Tejaswini
    Park, Joongyang
    Kang, Mingon
    IEEE ACCESS, 2019, 7 : 44562 - 44573
  • [36] Bootstrap-multiple-imputation; high-dimensional model validation with missing data
    Chang, Billy
    Demetrashvili, Nino
    Kowgier, Matthew
    CANADIAN JOURNAL OF STATISTICS-REVUE CANADIENNE DE STATISTIQUE, 2011, 39 (02) : 202 - 204
  • [37] Learning a Manifold Regression Model for Classifying High-dimensional Data
    Elkhoumri, A.
    Samir, C.
    Laassiri, J.
    9TH INTERNATIONAL SYMPOSIUM ON SIGNAL, IMAGE, VIDEO AND COMMUNICATIONS (ISIVC 2018), 2018, : 203 - 208
  • [38] Imputation of rounded zeros for high-dimensional compositional data
    Templ, Matthias
    Hron, Karel
    Filzmoser, Peter
    Gardlo, Alzbeta
    CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS, 2016, 155 : 183 - 190
  • [39] Genetic Programming for Imputation Predictor Selection and Ranking in Symbolic Regression with High-Dimensional Incomplete Data
    Al-Helali, Baligh
    Chen, Qi
    Xue, Bing
    Zhang, Mengjie
    AI 2019: ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, 11919 : 523 - 535
  • [40] A high-dimensional additive nonparametric model
    Wu, Frank C. Z.
    JOURNAL OF ECONOMIC DYNAMICS & CONTROL, 2024, 166