DualBoost: Handling Missing Values with Feature Weights and Weak Classifiers that Abstain

Cited by: 3
Authors
Wang, Weihong [1]
Xu, Jie [1]
Wang, Yang [1]
Cai, Chen [1]
Chen, Fang [1]
Affiliation
[1] CSIRO, Data61, Sydney, NSW, Australia
Source
CIKM'18: PROCEEDINGS OF THE 27TH ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT | 2018
Keywords
Boosting; missing values; feature weights; weak classifiers that abstain
DOI
10.1145/3269206.3269319
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Missing values are a common issue in real-world datasets. Handling them is one of the key aspects of data mining, as they can seriously degrade the performance of predictive models. In this paper we propose a unified Boosting framework that consolidates model construction and missing value handling. At each Boosting iteration, weights are assigned to both the samples and the features. The sample weights make difficult samples the focus of learning, while the feature weights enable critical features to be compensated for by less critical features when they are unavailable. A weak classifier that abstains (i.e., produces no prediction when a required feature value is missing) is learned on a data subset determined by the feature weights. Experimental results demonstrate the efficacy and robustness of the proposed method over existing Boosting algorithms.
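The abstract does not give the update rules, so the following is only a minimal sketch of one ingredient it names: a weak classifier that abstains when its feature is missing, plugged into a single boosting round. The update used here is the standard confidence-rated rule for hypotheses with outputs in {-1, 0, +1} (Schapire and Singer), which is an assumption and not necessarily what DualBoost uses. The feature-weighting step is not sketched because the abstract does not specify it. The names `abstaining_stump` and `boost_round` are hypothetical.

```python
import numpy as np


def abstaining_stump(x, threshold):
    """Threshold one feature; return 0 (abstain) where the value is missing."""
    pred = np.zeros(x.shape, dtype=float)
    present = ~np.isnan(x)
    pred[present] = np.where(x[present] > threshold, 1.0, -1.0)
    return pred


def boost_round(x, y, w, threshold):
    """One boosting round with an abstaining weak classifier.

    Assumption: uses the Schapire-Singer update for {-1, 0, +1} outputs,
    not an update taken from the DualBoost paper.
    """
    h = abstaining_stump(x, threshold)
    w_correct = w[h * y > 0].sum()  # weight of correctly classified samples
    w_wrong = w[h * y < 0].sum()    # weight of misclassified samples
    eps = 1e-10                     # smoothing to avoid division by zero
    alpha = 0.5 * np.log((w_correct + eps) / (w_wrong + eps))
    # Abstained samples (h == 0) are multiplied by exp(0) = 1, so only the
    # final renormalization changes their weight.
    w_new = w * np.exp(-alpha * y * h)
    return h, alpha, w_new / w_new.sum()


# Toy data: the third sample has a missing feature value.
x = np.array([0.2, 0.9, np.nan, 0.7, 0.1])
y = np.array([-1.0, 1.0, 1.0, -1.0, -1.0])
w = np.full(5, 0.2)
h, alpha, w_next = boost_round(x, y, w, threshold=0.5)
```

In this toy run the stump abstains on the third sample, and the one misclassified sample (the fourth) ends the round with the largest weight, which is the "difficult samples become the learning focus" behaviour the abstract describes.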
Pages: 1543-1546
Page count: 4