Classification of Multi-class Microarray Cancer Data Using Ensemble Learning Method

Cited by: 2
Authors
Shekar, B. H. [1 ]
Dagnew, Guesh [1 ]
Affiliations
[1] Mangalore Univ, Dept Comp Sci, Mangalore, Karnataka, India
Source
DATA ANALYTICS AND LEARNING | 2019 / Vol. 43
Keywords
Feature selection; Dimensionality reduction; Ensemble learning; Microarray cancer data classifier; FEATURE-SELECTION; GENE SELECTION;
DOI
10.1007/978-981-13-2514-4_24
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Nowadays, microarray cancer analysis is one of the top research areas in machine learning, computational biology, and pattern recognition. Classifying cancer data into their respective classes, and analyzing the result, plays a key role in diagnosis and treatment: in the binary case the aim is to identify negative and positive cases, while in the multi-class case the aim is to identify the type of cancer. The main challenges in microarray cancer datasets are the curse of dimensionality and the lack of sufficient sample data. To overcome these problems, feature selection and dimensionality reduction are explored to identify relevant features. In this work, we propose an ensemble learning method for multi-class cancer data classification. Information Gain (IG) is used for feature selection; it ranks attributes according to their relevance with respect to the class label. Three classifiers are used, namely k-Nearest Neighbor, Logistic Regression, and Random Forest. Tenfold cross-validation is applied to train and test the model. Experiments are conducted on standard multi-class cancer datasets, namely Leukemia 3 class, Leukemia 4 class, Harvard Lung cancer 5 class, and MLL 3 class. To evaluate the performance of the model, performance measures such as Classification Accuracy, F1-measure, and Area Under the Curve (AUC) are used. A confusion matrix is used to show whether or not samples are correctly classified. The classifiers' performance is compared on the basis of these evaluation criteria. Feature selection yields a significant performance improvement for all three classifiers, with the exception of Random Forest on MLL Leukemia, whose result is better on the original dataset than on the selected features. For the rest of the datasets, all classifiers registered better results due to feature selection.
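The Information Gain ranking mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not say how expression values were discretized or how many features were kept, so the median binarization, the helper names (`entropy`, `information_gain`, `binarize`, `rank_features`), and the toy data below are all assumptions made for illustration. The sketch computes IG(class; feature) = H(class) - H(class | feature) for each feature and sorts features by that score.

```python
import math
import statistics
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """IG(class; feature) = H(class) - H(class | feature), discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def binarize(column):
    """Assumed discretization: 1 if the value exceeds the column median."""
    median = statistics.median(column)
    return [1 if v > median else 0 for v in column]


def rank_features(rows, labels):
    """Return (feature indices sorted by descending IG, per-feature IG)."""
    columns = list(zip(*rows))
    scores = [information_gain(binarize(col), labels) for col in columns]
    order = sorted(range(len(columns)), key=lambda j: scores[j], reverse=True)
    return order, scores


# Toy data (made up): 8 samples, 3 features, 3 classes.
rows = [
    [9, 5, 1], [8, 1, 1], [9, 5, 5], [8, 1, 5],   # class A
    [1, 5, 1], [2, 1, 1],                         # class B
    [1, 5, 5], [2, 1, 5],                         # class C
]
labels = ["A", "A", "A", "A", "B", "B", "C", "C"]

ranking, scores = rank_features(rows, labels)
print(ranking)  # feature indices, most informative first
print(scores)   # information gain per feature, in bits
```

Keeping only the first k indices of `ranking` gives the reduced feature set that would then be passed to the classifiers. When feature selection is combined with cross-validation, the ranking is usually recomputed inside each training fold so that test samples do not influence which features are chosen.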
Pages: 279-292
Number of pages: 14
Related papers
50 in total
  • [1] A Hierarchical Ensemble of ECOC for cancer classification based on multi-class microarray data
    Liu, Kun-Hong
    Zeng, Zhi-Hao
    Ng, Vincent To Yee
    INFORMATION SCIENCES, 2016, 349 : 102 - 118
  • [2] Multi-Class Breast Cancer Classification Using Ensemble of Pretrained models and Transfer Learning
    Rao, Perumalla Murali Mallikarjuna
    Singh, Sanjay Kumar
    Khamparia, Aditya
    Bhushan, Bharat
    Podder, Prajoy
    CURRENT MEDICAL IMAGING, 2022, 18 (04) : 409 - 416
  • [3] An Effective Ensemble Method for Multi-class Classification and Regression for Imbalanced Data
    Alam, Tahira
    Ahmed, Chowdhury Farhan
    Zahin, Sabit Anwar
    Khan, Muhammad Asif Hossain
    Islam, Maliha Tashfia
    ADVANCES IN DATA MINING: APPLICATIONS AND THEORETICAL ASPECTS (ICDM 2018), 2018, 10933 : 59 - 74
  • [4] Multi-Class Text Classification on Khmer News Using Ensemble Method in Machine Learning Algorithms
    Phann, Raksmey
    Soomlek, Chitsutha
    Seresangtakul, Pusadee
    ACTA INFORMATICA PRAGENSIA, 2023, 12 (02) : 243 - 259
  • [5] A Study on Multi-class Classification of Breast Cancer Images using Ensemble Network and Transfer Learning
    Tipirneni L.
    Patan R.
    Recent Patents on Engineering, 2021, 15 (06)
  • [6] Multi-class cancer classification by total principal component regression (TPCR) using microarray gene expression data
    Tan, YX
    Shi, LM
    Tong, WD
    Wang, C
    NUCLEIC ACIDS RESEARCH, 2005, 33 (01) : 56 - 65
  • [7] Multi-TGDR: A Regularization Method for Multi-Class Classification in Microarray Experiments
    Tian, Suyan
    Suarez-Farinas, Mayte
    PLOS ONE, 2013, 8 (11):
  • [8] Multi-class classification of breast cancer abnormality using transfer learning
    Rani, Neha
    Gupta, Deepak Kumar
    Singh, Samayveer
    MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 (30) : 75085 - 75100
  • [9] Multi-class Ensemble Learning of Imbalanced Bidding Fraud Data
    Anowar, Farzana
    Sadaoui, Samira
    ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, 11489 : 352 - 358
  • [10] Multi-class WHMBoost: An ensemble algorithm for multi-class imbalanced data
    Zhao, Jiakun
    Jin, Ju
    Zhang, Yibo
    Zhang, Ruifeng
    Chen, Si
    INTELLIGENT DATA ANALYSIS, 2022, 26 (03) : 599 - 614