FRL: An Integrative Feature Selection Algorithm Based on the Fisher Score, Recursive Feature Elimination, and Logistic Regression to Identify Potential Genomic Biomarkers

Cited by: 3
Authors
Ge, Chenyu [1]
Luo, Liqun [2]
Zhang, Jialin [3]
Meng, Xiangbing [4]
Chen, Yun [5]
Affiliations
[1] Shandong Univ, Sch Mech Elect & Informat Engn, Jinan 250000, Peoples R China
[2] Peking Univ, Dept Informat Management, Beijing 100000, Peoples R China
[3] Paris Saclay Univ, Lab Rech Informat, F-91405 Paris, France
[4] Qufu Inst Tradit Chinese Med Hlth & Rehabil, Qufu 273100, Shandong, Peoples R China
[5] Shandong Univ TCM, Hosp 2, Jinan 250000, Peoples R China
Keywords
PRECISION MEDICINE; CANCER; CLASSIFICATION; PROGRESSION; PROLIFERATION; SIGNATURE; HSPB8;
DOI
10.1155/2021/4312850
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Discipline classification codes
071005; 0836; 090102; 100705
Abstract
Accurate screening of cancer biomarkers contributes to health assessment, drug screening, and targeted therapy in precision medicine. The rapid development of high-throughput sequencing technology has led to the identification of abundant genomic biomarkers, but most of them are limited to single-cancer analysis. This paper proposes FRL, an integrative feature selection algorithm that combines the Fisher score, Recursive feature elimination, and Logistic regression to explore potential cancer genomic biomarkers on cancer subsets. The Fisher score is first used to calculate gene weights and rapidly reduce the dimension. Recursive feature elimination and logistic regression are then jointly employed to extract the optimal subset. FRL achieves greater classification precision than GEO2R, the current differential expression analysis tool based on the Limma algorithm. Compared with five traditional feature selection algorithms, FRL exhibits excellent performance in accuracy (ACC) and F1-score and greatly improves computational efficiency. On high-noise datasets such as esophageal cancer, the ACC of FRL is 30% higher than the average ACC achieved with the other traditional algorithms. Biomarkers found in multiple studies are more reliable and reproducible, and show a stronger association with potential clinical value, than those found in a single analysis. Therefore, through literature review and spatial analyses of gene functional enrichment and functional pathways, we conduct cluster analysis on 10 diverse cancers with high mortality and form a potential biomarker module comprising 19 genes. All genes in this module can serve as potential biomarkers, providing more information on the overall oncogenesis mechanism for the detection of diverse early cancers and assisting targeted anticancer therapies for further developments in precision medicine.
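The two-stage procedure described in the abstract (a Fisher score prefilter followed by recursive feature elimination driven by logistic regression) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract does not give the formulas, so the code assumes the standard Fisher score (between-class scatter divided by within-class scatter), uses a plain gradient-descent logistic regression as a stand-in, and eliminates one feature per round. The names `frl_select`, `n_prefilter`, and `n_final`, and the synthetic data, are hypothetical.

```python
import math
import random


def fisher_scores(X, y):
    """Fisher score per feature: between-class scatter over within-class scatter."""
    n, d = len(X), len(X[0])
    classes = sorted(set(y))
    scores = []
    for j in range(d):
        col = [row[j] for row in X]
        mu = sum(col) / n
        between = within = 0.0
        for c in classes:
            vals = [v for v, label in zip(col, y) if label == c]
            mu_c = sum(vals) / len(vals)
            between += len(vals) * (mu_c - mu) ** 2
            within += sum((v - mu_c) ** 2 for v in vals)  # n_c * variance_c
        scores.append(between / (within + 1e-12))
    return scores


def standardize(X):
    """Z-score each column so weight magnitudes are comparable across features."""
    n, d = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(d)]
    stds = [math.sqrt(sum((row[j] - means[j]) ** 2 for row in X) / n) or 1.0
            for j in range(d)]
    return [[(row[j] - means[j]) / stds[j] for j in range(d)] for row in X]


def fit_logreg(X, y, lr=0.1, epochs=300):
    """Binary logistic regression by batch gradient descent; returns feature weights."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for row, target in zip(X, y):
            z = b + sum(wi * xi for wi, xi in zip(w, row))
            z = max(-30.0, min(30.0, z))  # keep exp() in a safe range
            err = 1.0 / (1.0 + math.exp(-z)) - target
            grad_b += err
            for j in range(d):
                grad_w[j] += err * row[j]
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w


def frl_select(X, y, n_prefilter, n_final):
    """Hypothetical helper: Fisher-score prefilter, then RFE on logistic weights."""
    scores = fisher_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    keep = ranked[:n_prefilter]  # stage 1: fast dimension reduction
    Xs = standardize(X)
    while len(keep) > n_final:   # stage 2: recursive elimination
        sub = [[row[j] for j in keep] for row in Xs]
        w = fit_logreg(sub, y)
        weakest = min(range(len(keep)), key=lambda i: abs(w[i]))
        keep.pop(weakest)        # drop the feature with the smallest |weight|
    return sorted(keep)


# Synthetic "expression matrix" (assumed for illustration only):
# 60 samples, 30 genes, with genes 0 and 1 shifted between the two classes.
rng = random.Random(0)
y = [i % 2 for i in range(60)]
X = [[rng.gauss(0, 1) for _ in range(30)] for _ in y]
for row, label in zip(X, y):
    row[0] += 4.0 * label
    row[1] -= 4.0 * label

scores = fisher_scores(X, y)
selected = frl_select(X, y, n_prefilter=8, n_final=2)
print(selected)
```

The prefilter is cheap because each gene is scored independently, so the more expensive refitting in the elimination loop only runs on the genes that survive it. In practice a library implementation of logistic regression and recursive feature elimination would likely replace the hand-written loops here.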
Pages: 16