Triple and quadruple optimization for feature selection in cancer biomarker discovery

Cited by: 1
Authors
Cattelani, L. [1]
Fortino, V. [1]
Affiliations
[1] Univ Eastern Finland, Sch Med, Inst Biomed, Kuopio 70210, Finland
Funding
Academy of Finland;
Keywords
Triple and quadruple optimization; Feature selection; Biomarker discovery; ALGORITHM; MARKER;
DOI
10.1016/j.jbi.2024.104736
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
The proliferation of omics data has advanced cancer biomarker discovery, but candidate biomarkers often fail external validation, mainly because a narrow focus on prediction accuracy neglects clinical utility and validation feasibility. We introduce three- and four-objective optimization strategies based on genetic algorithms to identify clinically actionable biomarkers in omics studies, addressing classification tasks that aim to distinguish hard-to-differentiate cancer subtypes beyond histological analysis alone. Our hypothesis is that optimizing more than one characteristic of cancer biomarkers yields biomarkers that are more likely to succeed in external validation. Our objectives are to: (i) assess the biomarker panel's accuracy using a machine learning (ML) framework; (ii) ensure the biomarkers exhibit significant fold-changes across subtypes, thereby boosting the success rate of PCR or immunohistochemistry validations; (iii) select a concise set of biomarkers to simplify the validation process and reduce clinical costs; and (iv) identify biomarkers crucial for predicting overall survival, which plays a significant role in determining the prognostic value of cancer subtypes. We implemented triple and quadruple optimization algorithms and applied them to renal carcinoma gene expression data from TCGA. The study targets kidney cancer subtypes that are difficult to distinguish through histopathology. Selected RNA-seq biomarkers were assessed against the gold standard method, which relies solely on clinical information, and in external microarray-based validation datasets. Notably, these biomarkers achieved an accuracy above 0.8 in external validations and added significant value to survival predictions, yielding a higher c-index than clinical data alone. The provided tool also helps explore the trade-off between objectives, offering multiple solutions for clinical evaluation before proceeding to costly validation or clinical trials.
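The trade-off between objectives described in the abstract can be illustrated with a Pareto-dominance filter. The following is a minimal sketch, not the authors' implementation: the objective tuple layout, the function names `dominates` and `pareto_front`, and the candidate scores are all hypothetical, and a genetic algorithm such as NSGA-II would apply a filter of this kind repeatedly during its search rather than once on a fixed list.

```python
from typing import List, Sequence, Tuple

# Hypothetical objective vector for one candidate biomarker panel.
# All three values are expressed so that larger is better:
#   (classification accuracy, mean absolute log2 fold-change, negated panel size)
Objectives = Tuple[float, float, float]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is at least as good as b everywhere and strictly better somewhere."""
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def pareto_front(points: Sequence[Objectives]) -> List[int]:
    """Return the indices of the points that no other point dominates."""
    return [
        i
        for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


# Made-up scores for four candidate panels (illustrative values only).
candidates: List[Objectives] = [
    (0.90, 1.5, -10),  # accurate, but larger than panel 3 with no other gain
    (0.85, 2.0, -5),   # less accurate, but strong fold-changes and a small panel
    (0.80, 1.0, -12),  # worse than panel 0 on every objective
    (0.90, 1.5, -8),   # same accuracy and fold-change as panel 0, fewer genes
]

front = pareto_front(candidates)
print(front)
```

The surviving panels are the ones a clinician would compare, since each represents a different compromise between accuracy, fold-change, and panel size. A fourth objective, such as a survival-related score, would extend the tuple without changing the dominance test.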
Pages: 8