Breast Cancer Prediction: Importance of Feature Selection

Cited by: 0
Author
Prateek [1]
Affiliation
[1] QR 1012, SECT 4-C, Bokaro Steel City, Jharkhand, India
Source
ADVANCES IN COMPUTER COMMUNICATION AND COMPUTATIONAL SCIENCES, IC4S 2018 | 2019, Vol. 924
Keywords
Machine learning; KNN; Feature selection; SVM; Logistic regression; Naive Bayes; Classification; Prediction algorithms; Breast cancer; CLASSIFICATION RULES; DIAGNOSIS;
DOI
10.1007/978-981-13-6861-5_62
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Breast cancer is one of the most widespread causes of death in women. According to one estimate, approximately 40,920 women were expected to die of breast cancer in 2018 alone, an alarming number that could be reduced if the cancer were diagnosed at an early stage. Advances in technology have made such predictions easier. Machine learning, one of the latest trends, makes it possible to predict diseases from physical or behavioral characteristics. In this paper, we apply several machine learning algorithms: decision trees, k-nearest neighbor (KNN), logistic regression, neural networks (NNs), naive Bayes, random forest, and support vector machine (SVM). We compare their outcomes on precision, recall, and F1 score. We then identify the least important features in the dataset, rerun all of the algorithms after removing those features, and compare the outcomes of the two implementation stages to understand the importance of feature selection in breast cancer prediction.
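The abstract compares classifiers on precision, recall, and F1 score before and after feature removal. As a rough illustration of how such a comparison might be scored, the sketch below computes the three metrics for a binary task using only the Python standard library. The function names `binary_scores` and `score_change`, the label encoding (1 = malignant, 0 = benign), and the toy prediction vectors are all assumptions made for this example; they are not taken from the paper and do not reflect its results.

```python
from typing import Dict, Sequence


def binary_scores(y_true: Sequence[int], y_pred: Sequence[int],
                  positive: int = 1) -> Dict[str, float]:
    """Precision, recall and F1 for the class labelled `positive`."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against empty denominators (no predicted or no actual positives).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


def score_change(before: Dict[str, float],
                 after: Dict[str, float]) -> Dict[str, float]:
    """Per-metric difference (after minus before) between two runs."""
    return {name: after[name] - before[name] for name in before}


# Toy labels invented for illustration only (1 = malignant, 0 = benign).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
pred_all_features = [1, 1, 0, 0, 0, 0, 0, 1]      # hypothetical stage 1 output
pred_reduced_features = [1, 1, 1, 0, 0, 0, 0, 1]  # hypothetical stage 2 output

stage1 = binary_scores(y_true, pred_all_features)
stage2 = binary_scores(y_true, pred_reduced_features)
print(stage1)
print(stage2)
print(score_change(stage1, stage2))
```

In a real experiment the two prediction vectors would come from the same classifier trained once on all features and once on the reduced feature set, evaluated on the same held-out samples, so that any change in the scores can be attributed to the removed features.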
Pages: 733-742
Page count: 10