Application of active learning in DNA microarray data for cancerous gene identification

Cited by: 13
Authors
Begum, Shemim [1]
Sarkar, Ram [2]
Chakraborty, Debasis [3]
Sen, Sagnik [2]
Maulik, Ujjwal [2]
Affiliations
[1] Govt Coll Engn & Text Technol, Comp Sci & Engn Dept, Murshidabad, W Bengal, India
[2] Jadavpur Univ, Comp Sci & Engn Dept, Kolkata, W Bengal, India
[3] Asansol Engn Coll, Comp Sci & Engn Dept, Asansol, W Bengal, India
Keywords
Active learning; Biomarker; Cancer prediction; Microarray data; Symmetrical uncertainty; SVM; EXPRESSION; CLASSIFICATION; OPTIMIZATION; SELECTION
DOI
10.1016/j.eswa.2021.114914
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Microarray technology plays an important role in evaluating gene expression data and in revealing distinctive expression patterns. In gene-expression-based experiments, the expression level of each gene is monitored in order to classify a tissue sample. The expression of genes is altered in response to pathogens. The altered expression values can be identified by comparing the genes of affected tissues/cells with those of unaffected tissues/cells; genes whose expression is altered in this way are termed biomarkers. In this paper, we develop an Active Learning (AL) model that uses a Support Vector Machine (SVM) in association with a feature selection (FS) algorithm called Symmetrical Uncertainty (SU) for the prediction of cancer. The effectiveness of the proposed AL and SU combination is demonstrated, and the biomarkers, or cancerous genes, identified by the proposed method on four gene-expression data sets are reported. In addition, biological significance tests are performed on the cancer biomarkers obtained from the data sets.
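As an illustration of the two ingredients named in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes gene expression levels have already been discretized, uses the standard definition SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)), and assumes margin-based uncertainty sampling as the AL query rule (the abstract does not state which query strategy is used). The names `rank_genes` and `most_uncertain` are hypothetical helpers introduced here for illustration.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)); 0 = independent, 1 = fully dependent."""
    h_x, h_y = entropy(x), entropy(y)
    if h_x + h_y == 0:
        return 0.0  # both variables are constant, so there is nothing to share
    h_xy = entropy(list(zip(x, y)))   # joint entropy H(X, Y)
    mutual_info = h_x + h_y - h_xy    # I(X; Y)
    return 2.0 * mutual_info / (h_x + h_y)


def rank_genes(expression, labels):
    """Rank genes by SU with the class label (hypothetical helper).

    expression: dict mapping gene name -> list of discretized levels, one per sample.
    """
    scores = {g: symmetrical_uncertainty(v, labels) for g, v in expression.items()}
    return sorted(scores, key=scores.get, reverse=True)


def most_uncertain(decision_values):
    """Assumed AL query rule: pick the unlabeled sample closest to the SVM hyperplane.

    decision_values: signed distances f(x) from a trained SVM, one per unlabeled sample.
    """
    return min(range(len(decision_values)), key=lambda i: abs(decision_values[i]))


# Toy data: four samples, two classes, two discretized genes (illustrative only).
labels = [0, 0, 1, 1]
expression = {"gene_a": [0, 0, 1, 1], "gene_b": [0, 1, 0, 1]}
ranking = rank_genes(expression, labels)
query_index = most_uncertain([1.5, -0.2, 0.9])
```

In an active learning loop of this kind, the sample at `query_index` would be sent for labeling, added to the training set, and the SVM retrained on the SU-selected genes.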
Pages: 8