Stable feature selection based on probability estimation in gene expression datasets

Cited by: 2
Authors
Ahmadi, Melika [1]
Mahmoodian, Hamid [1, 2]
Affiliations
[1] Islamic Azad Univ, Dept Elect Engn, Najafabad Branch, Najafabad, Iran
[2] Islamic Azad Univ, Digital Proc & Machine Vis Res Ctr, Najafabad Branch, Najafabad, Iran
Keywords
Feature selection; Stability; Probability estimation; LOGISTIC-REGRESSION; PHASE-DIAGRAM; CLASSIFICATION; STABILITY; ALGORITHM; ROBUST
DOI
10.1016/j.eswa.2024.123372
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Knowledge discovery from big datasets is one of the most important challenges in the pattern recognition field. More important still is how reliable the extracted information and the resulting models are. Studies have shown that these models usually depend heavily on the samples, features, data, and model structure. In general, stability is a key concern when building models. This paper presents a method that not only considers the effect of different well-known classifiers but also seeks a stable model for separating samples by combining different feature selection methods under a stability criterion. Briefly, the contributions of the proposed method are: 1) analyzing the ability of individual features to classify samples using different well-known classifiers, 2) estimating the probability that a feature is selected into high-impact feature sets, and 3) applying the stability concept to increase the weight of robust feature sets. The proposed algorithm is used to select high-impact genes in microarray datasets. Three high-dimensional gene expression datasets of cancerous tissues are used as benchmarks. The results show relative superiority over other methods.
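The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea of estimating selection probabilities and measuring stability. It is not the authors' method. It assumes two classes labeled 0 and 1, replaces the classifier-based evaluation of each feature with a simple signal-to-noise score, and estimates each feature's selection probability as its top-k frequency over stratified bootstrap resamples. All function names (`univariate_score`, `selection_probabilities`, `mean_jaccard`) and the toy data are hypothetical.

```python
import random
from itertools import combinations
from statistics import fmean, pstdev


def univariate_score(values0, values1):
    """Separation of one feature between two classes (signal-to-noise style).
    Assumption: stands in for the per-classifier evaluation in the abstract."""
    spread = pstdev(values0) + pstdev(values1) + 1e-12  # avoid division by zero
    return abs(fmean(values1) - fmean(values0)) / spread


def selection_probabilities(X, y, k, n_resamples=50, seed=0):
    """Estimate how often each feature lands in the top-k set across
    stratified bootstrap resamples. Returns (probabilities, selected_sets)."""
    rng = random.Random(seed)
    n_features = len(X[0])
    idx0 = [i for i, label in enumerate(y) if label == 0]
    idx1 = [i for i, label in enumerate(y) if label == 1]
    counts = [0] * n_features
    selected_sets = []
    for _ in range(n_resamples):
        # Resample within each class so both classes are always present.
        boot0 = [rng.choice(idx0) for _ in idx0]
        boot1 = [rng.choice(idx1) for _ in idx1]
        scores = [
            univariate_score([X[i][j] for i in boot0], [X[i][j] for i in boot1])
            for j in range(n_features)
        ]
        top_k = set(sorted(range(n_features), key=lambda j: -scores[j])[:k])
        selected_sets.append(top_k)
        for j in top_k:
            counts[j] += 1
    return [c / n_resamples for c in counts], selected_sets


def mean_jaccard(sets):
    """Average pairwise Jaccard similarity; 1.0 means identical selections.
    Requires at least two non-empty sets."""
    pairs = list(combinations(sets, 2))
    return fmean(len(a & b) / len(a | b) for a, b in pairs)


# Toy data (hypothetical): feature 0 separates the classes, the rest is noise.
toy_rng = random.Random(1)
y = [0] * 10 + [1] * 10
X = [[10 * label + toy_rng.gauss(0, 1)] + [toy_rng.gauss(0, 1) for _ in range(4)]
     for label in y]
probs, sets = selection_probabilities(X, y, k=1)
print("selection probabilities:", probs)
print("stability (mean Jaccard):", mean_jaccard(sets))
```

A feature with a high selection frequency across resamples is a candidate for a robust, high-impact set. A mean Jaccard similarity near 1 indicates that the selected sets change little when the samples change.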
Pages: 14