A hybrid Bayesian network and tensor factorization approach for missing value imputation to improve breast cancer recurrence prediction

Cited by: 31
Authors
Vazifehdan, Mahin [1]
Moattar, Mohammad Hossein [1]
Jalali, Mehrdad [1]
Affiliations
[1] Islamic Azad Univ, Mashhad Branch, Dept Software Engn, Mashhad, Iran
Keywords
Breast cancer recurrence; Missing value imputation; Classification; Tensor factorization; Bayesian network; MODEL; REGRESSION;
DOI
10.1016/j.jksuci.2018.01.002
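Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract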
Data mining and machine learning approaches can be used to predict breast cancer recurrence. However, real datasets often include missing values for various reasons. In this paper, a hybrid imputation method is proposed that takes into account the dependency between attributes and the type of each incomplete attribute, with the specific aim of improving breast cancer recurrence prediction. The dataset is first split into a discrete subset and a numerical subset, and the missing values of the discrete fields are imputed using a Bayesian network. Then, using tensor factorization, the integrated dataset, which comprises the filled subset from the previous stage and the numerical subset with missing values, is constructed so that the continuous missing values are imputed and the accuracy of imputation is enhanced. We evaluated the proposed method against six imputation methods, namely mean, hot-deck, K-NN, weighted K-NN, tensor factorization, and Bayesian network, on three datasets, and used three classifiers, namely decision tree, K-nearest neighbor, and support vector machine, for recurrence prediction. Experimental results show that the proposed method yields an average prediction improvement of 0.26. In addition, the proposed approach outperforms all other imputation-classifier pairs in terms of specificity, sensitivity, and accuracy. © 2018 The Authors. Production and hosting by Elsevier B.V. on behalf of King Saud University.
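The two-stage pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the network structure or the factorization model, so two simple stand-ins are assumed here. Stage 1 uses a fixed two-node network (parent -> child) and fills a discrete gap with the most frequent child value for the same parent value. Stage 2 uses an iterative truncated SVD on the combined matrix, which is only the two-way (matrix) special case of tensor factorization. The function names, the `rank` and `n_iter` parameters, and the toy data are all hypothetical.

```python
import numpy as np


def impute_discrete(child, parent):
    """Stage 1 stand-in: fill NaNs in `child` with the most frequent observed
    child value among rows that share the same `parent` value.
    Assumes `parent` is fully observed."""
    child = child.astype(float).copy()
    observed = ~np.isnan(child)
    # Fallback when a parent value has no observed child: the global mode.
    values, counts = np.unique(child[observed], return_counts=True)
    global_mode = values[np.argmax(counts)]
    for i in np.flatnonzero(~observed):
        same = observed & (parent == parent[i])
        if same.any():
            v, c = np.unique(child[same], return_counts=True)
            child[i] = v[np.argmax(c)]
        else:
            child[i] = global_mode
    return child


def impute_low_rank(X, rank=2, n_iter=200):
    """Stage 2 stand-in: fill NaNs in X by repeatedly replacing them with the
    values of a rank-`rank` truncated-SVD approximation, while keeping the
    observed entries fixed. Assumes every column has an observed value."""
    X = X.astype(float)
    missing = np.isnan(X)
    # Start from column means at the missing positions.
    filled = np.where(missing, np.nanmean(X, axis=0), X)
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        approx = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        filled = np.where(missing, approx, X)
    return filled


# Toy data (made up for illustration only).
parent = np.array([0, 0, 0, 1, 1, 1])
child = np.array([1, 1, np.nan, 0, 0, np.nan])
discrete_filled = impute_discrete(child, parent)

numeric = np.outer([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], [1.0, 2.0, 3.0])
numeric[1, 2] = np.nan  # one missing continuous value

# Integrate the filled discrete column with the incomplete numeric block,
# then impute the remaining gaps with the low-rank model.
combined = np.column_stack([discrete_filled, numeric])
result = impute_low_rank(combined, rank=2)
```

Filling the discrete attributes first means the second stage sees a complete discrete column, so the low-rank model can use it as extra signal when estimating the continuous gaps. That mirrors the ordering in the abstract, though a real Bayesian network would model dependencies among many attributes rather than a single parent.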
Pages: 175 - 184
Page count: 10
Related papers
50 in total
  • [2] Missing traffic data imputation and pattern discovery with a Bayesian augmented tensor factorization model
    Chen, Xinyu
    He, Zhaocheng
    Chen, Yixian
    Lu, Yuhuan
    Wang, Jiawei
    TRANSPORTATION RESEARCH PART C-EMERGING TECHNOLOGIES, 2019, 104 : 66 - 77
  • [3] Missing value imputation for breast cancer diagnosis data using tensor factorization improved by enhanced reduced adaptive particle swarm optimization
    Nekouie, Atefeh
    Moattar, Mohammad Hossein
    JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES, 2019, 31 (03) : 287 - 294
  • [4] Missing Value Imputation on Multiple Measurements for Prediction of Liver Cancer Recurrence: A Comparative Study
    Ping, Xiao-Ou
    Tseng, Yi-Ju
    Liang, Ja-Der
    Huang, Guan-Tarn
    Yang, Pei-Ming
    Lai, Feipei
    INTELLIGENT SYSTEMS AND APPLICATIONS (ICS 2014), 2015, 274 : 1930 - 1939
  • [6] A hybrid imputation approach for microarray missing value estimation
    Li, Huihui
    Zhao, Changbo
    Shao, Fengfeng
    Li, Guo-Zheng
    Wang, Xiao
    BMC GENOMICS, 2015, 16
  • [7] Hybrid prediction model with missing value imputation for medical data
    Purwar, Archana
    Singh, Sandeep Kumar
    EXPERT SYSTEMS WITH APPLICATIONS, 2015, 42 (13) : 5621 - 5631
  • [8] A missing value imputation method using a Bayesian network with weighted learning
    Miyakoshi, Yoshihiro
    Kato, Shohei
    ELECTRONICS AND COMMUNICATIONS IN JAPAN, 2012, 95 (12) : 1 - 9
  • [9] Effective Bayesian-network-based missing value imputation enhanced by crowdsourcing
    Ye, Chen
    Wang, Hongzhi
    Lu, Wenbo
    Li, Jianzhong
    KNOWLEDGE-BASED SYSTEMS, 2020, 190
  • [10] An Auxiliary Approach to Prediction of Binary Outcome with Bayesian Network Model: Exploration with Data for Recurrence of Breast Cancer
    Ganapathy, Sachit
    Harichandrakumar, K. T.
    Tamilarasu, Kadhiravan
    Penumadu, Prasanth
    Nair, N. Sreekumaran
    JOURNAL OF CLINICAL AND DIAGNOSTIC RESEARCH, 2023, 17 (03) : YC6 - YC10