DualBoost: Handling Missing Values with Feature Weights and Weak Classifiers that Abstain

Cited by: 3
Authors
Wang, Weihong [1]
Xu, Jie [1]
Wang, Yang [1]
Cai, Chen [1]
Chen, Fang [1]
Affiliations
[1] CSIRO, Data61, Sydney, NSW, Australia
Source
CIKM'18: PROCEEDINGS OF THE 27TH ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT | 2018
Keywords
Boosting; missing values; feature weights; weak classifiers that abstain
DOI
10.1145/3269206.3269319
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Missing values are common in real-world datasets. Handling them is a key aspect of data mining, as they can seriously degrade the performance of predictive models. In this paper we propose a unified Boosting framework that consolidates model construction and missing value handling. At each Boosting iteration, weights are assigned to both the samples and the features. The sample weights focus learning on difficult samples, while the feature weights enable critical features to be compensated by less critical features when they are unavailable. A weak classifier that abstains (i.e., produces no prediction when a required feature value is missing) is learned on a data subset determined by the feature weights. Experimental results demonstrate the efficacy and robustness of the proposed method compared with existing Boosting algorithms.
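The abstract does not give the update rules, so the following is only a minimal sketch of the abstention idea, not the DualBoost algorithm itself. It assumes a single decision stump and a confidence-rated AdaBoost-style update in which abstaining samples keep their weight before normalization; this is a swapped-in, generic technique. The feature weights described in the abstract are not modeled here. The names `stump_predict` and `boosting_round`, the use of `None` for a missing value, and the toy data are all hypothetical.

```python
import math


def stump_predict(x, feature, threshold):
    """Return +1 or -1, or 0 (abstain) when the required feature is missing."""
    value = x[feature]
    if value is None:  # assumption: missing values are encoded as None
        return 0
    return 1 if value > threshold else -1


def boosting_round(X, y, weights, feature, threshold, eps=1e-9):
    """One boosting round with an abstaining stump (generic sketch, not DualBoost)."""
    preds = [stump_predict(x, feature, threshold) for x in X]
    # Weight mass of correct and incorrect predictions; abstentions are excluded.
    w_correct = sum(w for w, p, t in zip(weights, preds, y) if p != 0 and p == t)
    w_wrong = sum(w for w, p, t in zip(weights, preds, y) if p != 0 and p != t)
    # Classifier weight; eps guards against division by zero.
    alpha = 0.5 * math.log((w_correct + eps) / (w_wrong + eps))
    # Abstained samples have p == 0, so exp(0) == 1 leaves their weight unchanged
    # until normalization.
    updated = [w * math.exp(-alpha * t * p) for w, p, t in zip(weights, preds, y)]
    total = sum(updated)
    return alpha, [w / total for w in updated], preds


# Toy data: one feature, third sample has a missing value.
X = [[1.0], [2.0], [None], [3.0]]
y = [-1, 1, 1, -1]
weights = [0.25] * 4
alpha, new_weights, preds = boosting_round(X, y, weights, feature=0, threshold=1.5)
print(preds, round(alpha, 4), [round(w, 4) for w in new_weights])
```

In this toy run the stump abstains on the third sample, the misclassified fourth sample gains weight, and the correctly classified samples lose weight, which is the sample-weight behavior the abstract describes.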
Pages: 1543-1546
Number of pages: 4