Classification of breast cancer using microarray gene expression data: A survey

Cited by: 42
Authors
Abd-Elnaby, Muhammed [1]
Alfonse, Marco [1]
Roushdy, Mohamed [2]
Affiliations
[1] Ain Shams Univ, Fac Comp & Informat Sci, Cairo, Egypt
[2] Future Univ, Fac Comp & Informat Technol, New Cairo, Egypt
Keywords
Feature selection; Machine learning; Cancer classification; Microarray data
DOI
10.1016/j.jbi.2021.103764
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Cancer, and breast cancer in particular, is one of the most common causes of death worldwide, according to the World Health Organization. For this reason, extensive research has been devoted to the accurate and early diagnosis of cancer in order to increase the likelihood of cure. Among the available tools for diagnosing cancer, microarray technology has proven effective. Microarray technology analyzes the expression levels of thousands of genes simultaneously. Although the large number of features (genes) in microarray data may seem advantageous, many of these features are irrelevant or redundant, which degrades classification accuracy. To overcome this challenge, feature selection is a mandatory preprocessing step before classification. This paper reviews the main feature selection and classification techniques introduced in the literature for cancer, particularly breast cancer, with the aim of improving microarray-based classification.
Pages: 9