DualBoost: Handling Missing Values with Feature Weights and Weak Classifiers that Abstain

Cited by: 3

Authors:

Wang, Weihong [1]

Xu, Jie [1]

Wang, Yang [1]

Cai, Chen [1]

Chen, Fang [1]

Affiliation:

[1] CSIRO, Data61, Sydney, NSW, Australia

Source:

CIKM'18: PROCEEDINGS OF THE 27TH ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT | 2018

Keywords:

Boosting; missing values; feature weights; weak classifiers that abstain

DOI:

10.1145/3269206.3269319

Chinese Library Classification (CLC) code:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

Missing values are a common issue in real-world datasets. Handling them is one of the most important aspects of data mining, as missing values can seriously degrade the performance of predictive models. In this paper we propose a unified Boosting framework that consolidates model construction and missing value handling. At each Boosting iteration, weights are assigned to both the samples and the features. The sample weights make difficult samples the learning focus, while the feature weights enable critical features to be compensated by less critical features when they are unavailable. A weak classifier that abstains (i.e., produces no prediction when a required feature value is missing) is learned on a data subset determined by the feature weights. Experimental results demonstrate the efficacy and robustness of the proposed method over existing Boosting algorithms.
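
The idea of a weak classifier that abstains can be illustrated with a minimal Python sketch. It is not the DualBoost algorithm: the abstract does not give the feature-weight update or the subset selection rule, so both are left out. The sketch assumes a decision stump that returns 0 when its feature is missing (encoded here as `None`) and uses the standard confidence-rated Boosting weight for classifiers with range {-1, 0, +1}, alpha = 0.5 * ln(W_correct / W_wrong). All function names, the toy data, and the `eps` smoothing term are hypothetical.

```python
import math


def stump_predict(x, feature, threshold, polarity):
    """Return +1 or -1, or 0 (abstain) when the required feature is missing."""
    value = x[feature]
    if value is None:  # assumption: missing values are encoded as None
        return 0
    return polarity if value > threshold else -polarity


def abstaining_alpha(X, y, weights, feature, threshold, polarity, eps=1e-10):
    """Classifier weight for a stump that may abstain.

    Abstained samples contribute to neither the correct nor the wrong mass.
    eps is a small smoothing constant (an assumption) to avoid division by zero.
    """
    w_correct = 0.0
    w_wrong = 0.0
    for x, label, w in zip(X, y, weights):
        pred = stump_predict(x, feature, threshold, polarity)
        if pred == 0:
            continue
        if pred == label:
            w_correct += w
        else:
            w_wrong += w
    return 0.5 * math.log((w_correct + eps) / (w_wrong + eps))


def update_sample_weights(X, y, weights, alpha, feature, threshold, polarity):
    """Exponential reweighting; abstentions leave a weight unscaled before normalization."""
    scaled = [
        w * math.exp(-alpha * label * stump_predict(x, feature, threshold, polarity))
        for x, label, w in zip(X, y, weights)
    ]
    total = sum(scaled)
    return [w / total for w in scaled]


# Toy data: two features, None marks a missing value.
X = [[1.0, None], [2.0, 0.5], [None, 0.1], [3.0, 0.9]]
y = [-1, 1, 1, -1]
w = [0.25, 0.25, 0.25, 0.25]

# Stump on feature 0 with threshold 1.5: abstains on the third sample.
alpha = abstaining_alpha(X, y, w, feature=0, threshold=1.5, polarity=1)
new_w = update_sample_weights(X, y, w, alpha, feature=0, threshold=1.5, polarity=1)
print(alpha, new_w)
```

In this toy run the misclassified sample gains weight, the correctly classified samples lose weight, and the sample on which the stump abstains sits in between, which is why a later weak classifier built on a different feature can still pick it up.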

Citation

Pages: 1543-1546

Number of pages: 4

