(CDRGI)-Cancer detection through relevant genes identification

Cited by: 3
Authors
Al-Obeidat, Feras [1]
Rocha, Alvaro [2]
Akram, Maryam [3]
Razzaq, Saad [3]
Maqbool, Fahad [3]
Affiliations
[1] Zayed Univ, Coll Technol Innovat, Abu Dhabi, U Arab Emirates
[2] Univ Lisbon, ISEG Lisbon Sch Econ & Management, Lisbon, Portugal
[3] Univ Sargodha, Dept Comp Sci & IT, Sargodha, Pakistan
Keywords
Support vector machine; Cascading classifier; Discrete filtering; Artificial bee colony; Gene expression; CatBoost classifier; Convolutional neural network; EXPRESSION; CLASSIFICATION; ALGORITHM; MACHINE;
DOI
10.1007/s00521-021-05739-8
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Cancer is a genetic disease that ranks among the most lethal and aggressive diseases. Early staging of the disease can reduce the high mortality rate associated with cancer. Advances in high-throughput sequencing technology and the application of Machine Learning algorithms have led to significant progress in oncogenomics over the past few decades. Oncogenomics uses RNA sequencing and gene expression profiling to identify cancer-related genes. The high dimensionality of RNA sequencing data makes this a complex, large-scale optimization problem. CDRGI presents a Discrete Filtering technique based on a Binary Artificial Bee Colony coupled with a Support Vector Machine, together with a two-stage cascading classifier, to identify relevant genes and detect cancer using RNA-seq data. The proposed approach has been tested on seven cancers: Breast Cancer, Stomach Cancer (STAD), Colon Cancer (COAD), Liver Cancer, Lung Cancer (LUSC), Kidney Cancer (KIRC), and Skin Cancer. The results show that CDRGI performs better at feature reduction while achieving better classification accuracy for the STAD, COAD, LUSC and KIRC cancer types.
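The abstract gives no implementation details, so the following is only a minimal sketch of the general idea of a binary Artificial Bee Colony search over gene subsets, not the authors' method. Everything named here is an assumption made for illustration: the function `binary_abc`, its parameters, the single-bit-flip neighbourhood, and the `toy_fitness` function, which stands in for what would plausibly be a cross-validated Support Vector Machine accuracy with a penalty on subset size.

```python
import random


def binary_abc(n_features, fitness, n_sources=10, n_iters=30, limit=5, seed=0):
    """Simplified binary ABC: search boolean feature masks that maximize fitness."""
    rng = random.Random(seed)

    def random_mask():
        return [rng.random() < 0.5 for _ in range(n_features)]

    def neighbour(mask):
        # Flip one randomly chosen bit to get a nearby candidate subset.
        new = list(mask)
        j = rng.randrange(n_features)
        new[j] = not new[j]
        return new

    sources = [random_mask() for _ in range(n_sources)]
    scores = [fitness(m) for m in sources]
    trials = [0] * n_sources
    best_i = max(range(n_sources), key=scores.__getitem__)
    best_mask, best_score = list(sources[best_i]), scores[best_i]

    def try_improve(i):
        # Greedy replacement: keep the candidate only if it scores higher.
        nonlocal best_mask, best_score
        cand = neighbour(sources[i])
        s = fitness(cand)
        if s > scores[i]:
            sources[i], scores[i], trials[i] = cand, s, 0
            if s > best_score:
                best_mask, best_score = list(cand), s
        else:
            trials[i] += 1

    for _ in range(n_iters):
        # Employed bees: one local move per food source.
        for i in range(n_sources):
            try_improve(i)
        # Onlooker bees: revisit sources with probability tied to their score.
        low = min(scores)
        weights = [s - low + 1e-9 for s in scores]
        for i in rng.choices(range(n_sources), weights=weights, k=n_sources):
            try_improve(i)
        # Scout bees: replace sources that have stopped improving.
        for i in range(n_sources):
            if trials[i] > limit:
                sources[i] = random_mask()
                scores[i] = fitness(sources[i])
                trials[i] = 0
                if scores[i] > best_score:
                    best_mask, best_score = list(sources[i]), scores[i]
    return best_mask, best_score


# Hypothetical "relevant genes" for the toy demo (not from the paper).
INFORMATIVE = {2, 5, 11}


def toy_fitness(mask):
    # Stand-in for classifier accuracy minus a penalty on subset size.
    hits = sum(1 for j in INFORMATIVE if mask[j])
    return hits - 0.05 * sum(mask)


mask, score = binary_abc(20, toy_fitness)
selected = [j for j, keep in enumerate(mask) if keep]
print(selected, score)
```

In a realistic setting the fitness callable would train and validate a classifier on the columns selected by the mask, which is far more expensive per evaluation than this toy score; the search loop itself would not need to change.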
Pages: 8447-8454
Number of pages: 8