Detection of colon cancer based on microarray dataset using machine learning as a feature selection and classification techniques

Cited by: 22
Authors
Shafi, A. S. M. [1 ,2 ]
Molla, M. M. Imran [2 ]
Jui, Julakha Jahan [3 ]
Rahman, Mohammad Motiur [1 ]
Affiliations
[1] Mawlana Bhashani Sci & Technol Univ, Dept Comp Sci & Engn, Tangail 1902, Bangladesh
[2] Khwaja Yunus Ali Univ, Fac Comp Sci & Engn, Sirajgonj 6751, Bangladesh
[3] Univ Malaysia Pahang, Fac Elect & Elect Engn, Pekan 26600, Pahang, Malaysia
Source
SN APPLIED SCIENCES | 2020 / Vol. 2 / Issue 07
Keywords
Colon cancer; Microarray data; Feature selection; Machine learning; Random forest; Cross validation; PARTICLE SWARM OPTIMIZATION; GENE; PREDICTION;
DOI
10.1007/s42452-020-3051-2
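Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code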
07 ; 0710 ; 09 ;
Abstract
Microarray data are an increasingly important source of gene expression information for analysis and interpretation. In most gene expression studies, researchers try to use the smallest possible set of relevant gene expression profiles to improve tumor identification accuracy. This research aims to analyze and predict colon cancer data using a machine learning approach and a feature selection technique based on a random forest classifier. More specifically, the proposed method reduces the burden of high-dimensional data and allows faster computation by combining "Mean Decrease Accuracy" and "Mean Decrease Gini" as feature selection methods within the well-known Random Forest classifier, with the aim of increasing the prediction model's accuracy. In addition, we present a comparative analysis of the model with and without feature selection. Extensive experimental results demonstrate that the proposed model with feature selection is effective and achieves the best accuracy.
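The idea in the abstract, ranking genes by "Mean Decrease Accuracy" (MDA) and "Mean Decrease Gini" (MDG) and then training a random forest on the top-ranked subset, can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes scikit-learn, uses synthetic data from `make_classification` in place of the colon cancer microarray set, uses `feature_importances_` as the MDG analogue and `permutation_importance` as a stand-in for MDA, and combines the two by summing ranks. The helper names `select_features` and `compare_with_and_without_selection`, the combination rule, and all sizes and hyperparameters are hypothetical choices made for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import StratifiedKFold


def select_features(X, y, top_k=10, seed=0):
    """Return indices of the top_k features by summed MDG and MDA rank.

    Hypothetical helper: the rank-sum rule is an assumption, not the
    combination rule used in the paper.
    """
    forest = RandomForestClassifier(n_estimators=50, random_state=seed)
    forest.fit(X, y)
    # Impurity-based importance, the scikit-learn analogue of Mean Decrease Gini.
    mdg = forest.feature_importances_
    # Permutation importance as a stand-in for Mean Decrease Accuracy.
    # It is computed on the fitting data here, not on out-of-bag samples.
    mda = permutation_importance(
        forest, X, y, n_repeats=3, random_state=seed
    ).importances_mean
    # Convert scores to ranks (0 = most important) and add them.
    mdg_rank = np.argsort(np.argsort(-mdg))
    mda_rank = np.argsort(np.argsort(-mda))
    return np.argsort(mdg_rank + mda_rank)[:top_k]


def compare_with_and_without_selection(X, y, top_k=10, seed=0):
    """Cross-validated accuracy of a forest on all features vs. selected ones."""
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    full_scores, selected_scores = [], []
    for train, test in cv.split(X, y):
        full = RandomForestClassifier(n_estimators=50, random_state=seed)
        full.fit(X[train], y[train])
        full_scores.append(full.score(X[test], y[test]))
        # Select features on the training fold only, to avoid leaking test data.
        keep = select_features(X[train], y[train], top_k=top_k, seed=seed)
        reduced = RandomForestClassifier(n_estimators=50, random_state=seed)
        reduced.fit(X[train][:, keep], y[train])
        selected_scores.append(reduced.score(X[test][:, keep], y[test]))
    return float(np.mean(full_scores)), float(np.mean(selected_scores))


if __name__ == "__main__":
    # Synthetic stand-in data; sample and feature counts are arbitrary.
    X, y = make_classification(
        n_samples=62, n_features=60, n_informative=6, n_redundant=0, random_state=0
    )
    acc_full, acc_selected = compare_with_and_without_selection(X, y)
    print(f"accuracy, all features:      {acc_full:.3f}")
    print(f"accuracy, selected features: {acc_selected:.3f}")
```

Selecting features inside each training fold keeps the held-out fold unseen during ranking, so the two accuracies are comparable. Whether the reduced model wins depends on the data; the sketch makes no claim about the outcome.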
Pages: 8