Automatic detection of lung cancer from biomedical data set using discrete AdaBoost optimized ensemble learning generalized neural networks

Cited by: 85
Authors
Shakeel, P. Mohamed [1]
Tolba, Amr [2, 3]
Al-Makhadmeh, Zafer [2]
Jaber, Mustafa Musa [4]
Affiliations
[1] Univ Tekn Malaysia Melaka, Fac Informat & Commun Technol, Durian Tunggal, Malaysia
[2] King Saud Univ, Community Coll, Dept Comp Sci, Riyadh, Saudi Arabia
[3] Menoufia Univ, Fac Sci, Math & Comp Sci Dept, Shibin Al Kawm, Egypt
[4] Dijlah Univ Coll, Dept Comp Sci, Baghdad, Iraq
Keywords
Computer-aided diagnosis; Neural computing; Biomedical; ELVIRA Biomedical Data Set Repository; Minimum repetition and Wolf heuristic features; Discrete AdaBoost optimized ensemble learning generalized neural networks; MUTUAL INFORMATION; CLASSIFICATION; DIAGNOSIS; SELECTION; SEGMENTATION; FOREST;
DOI
10.1007/s00521-018-03972-2
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Lung cancer affects a large number of people, mainly because of genetic changes in lung tissue. Smoking, alcohol, and exposure to dangerous gases are also contributory causes. Because of the serious consequences of the disease, medical associations have been striving to diagnose it at an early stage of growth by applying computer-aided diagnosis (CAD). Although the CAD systems at healthcare centers can diagnose lung cancer at an early stage, high detection accuracy is difficult to achieve, mainly because of overfitting of the lung cancer features and the dimensionality of the feature set. This paper therefore introduces optimized neural computing and soft computing techniques to reduce these difficulties in the feature set. First, lung biomedical data were collected from the ELVIRA Biomedical Data Set Repository. Noise in the data was eliminated by applying a bin smoothing normalization process. The minimum repetition and Wolf heuristic features were then selected to reduce the dimensionality and complexity of the features. The selected lung features were analyzed using discrete AdaBoost optimized ensemble learning generalized neural networks, which classified the normal and abnormal features effectively. The efficiency of the system was then evaluated in a MATLAB experimental setup in terms of error rate, precision, recall, G-mean, F-measure, and prediction rate.
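The evaluation measures named in the abstract can be illustrated with a minimal sketch. This is not the authors' code (the abstract reports a MATLAB setup); Python is used here only for illustration, and the function name `binary_metrics` and its argument names are hypothetical. It assumes a binary normal/abnormal confusion matrix and takes G-mean to be the geometric mean of sensitivity (recall) and specificity, which is a common definition but is not confirmed by the abstract. "Prediction rate" is not defined in the abstract and is therefore left out.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute common binary-classification measures from confusion counts.

    tp, fp, tn, fn: counts of true positives, false positives,
    true negatives, and false negatives (hypothetical argument names).
    """
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")

    # Fraction of all samples that were misclassified.
    error_rate = (fp + fn) / total

    # Guard each ratio against a zero denominator.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0

    # Assumption: G-mean = sqrt(sensitivity * specificity).
    g_mean = math.sqrt(recall * specificity)

    # F-measure (F1): harmonic mean of precision and recall.
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0

    return {
        "error_rate": error_rate,
        "precision": precision,
        "recall": recall,
        "g_mean": g_mean,
        "f_measure": f_measure,
    }


if __name__ == "__main__":
    # Illustrative counts only; these are not results from the paper.
    for name, value in binary_metrics(tp=8, fp=2, tn=8, fn=2).items():
        print(f"{name}: {value:.3f}")
```

G-mean is often reported alongside F-measure on imbalanced biomedical data because it penalizes a classifier that does well on one class at the expense of the other.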
Pages: 777-790
Page count: 14