StackTTCA: a stacking ensemble learning-based framework for accurate and high-throughput identification of tumor T cell antigens

Cited by: 2
Authors
Charoenkwan, Phasit [1]
Schaduangrat, Nalini [2]
Shoombuatong, Watshara [2]
Affiliations
[1] Chiang Mai Univ, Coll Arts Media & Technol, Modern Management & Informat Technol, Chiang Mai 50200, Thailand
[2] Mahidol Univ, Fac Med Technol, Ctr Res Innovat & Biomed Informat, Bangkok 10700, Thailand
Keywords
T-cell antigen; Bioinformatics; Stacking strategy; Feature selection; Machine learning; Prediction
DOI
10.1186/s12859-023-05421-x
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Background: The identification of tumor T cell antigens (TTCAs) is crucial for providing insights into their functional mechanisms and utilizing their potential in anticancer vaccine development. In this context, TTCAs are highly promising. Meanwhile, experimental technologies for discovering and characterizing new TTCAs are expensive and time-consuming. Although many machine learning (ML)-based models have been proposed for identifying new TTCAs, there is still a need to develop a robust model that can achieve higher accuracy and precision.

Results: In this study, we propose a new stacking ensemble learning-based framework, termed StackTTCA, for accurate and large-scale identification of TTCAs. Firstly, we constructed 156 different baseline models by combining 12 different feature encoding schemes with 13 popular ML algorithms. Secondly, these baseline models were trained and employed to create a new probabilistic feature vector. Finally, the optimal probabilistic feature vector was determined based on the feature selection strategy and then used for the construction of our stacked model. Comparative benchmarking experiments indicated that StackTTCA clearly outperformed several ML classifiers and the existing methods on the independent test, with an accuracy of 0.932 and a Matthews correlation coefficient of 0.866.

Conclusions: In summary, the proposed stacking ensemble learning-based framework of StackTTCA could help to precisely and rapidly identify true TTCAs for follow-up experimental verification. In addition, we developed an online web server () to maximize user convenience for high-throughput screening of novel TTCAs.
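The three steps in the abstract (baseline models from encoder and learner pairs, a probabilistic feature vector, and a stacked model built on selected features) can be illustrated with a minimal, standard-library-only sketch. Everything concrete here is an assumption made for illustration: the two toy encoders, the `CentroidScorer` learner, the 2-fold out-of-fold scheme, the accuracy-based feature ranking, and the made-up peptide strings are hypothetical stand-ins, not the 12 encodings, 13 algorithms, selection strategy, or data used in StackTTCA.

```python
from itertools import product
from math import exp, sqrt

# Toy stand-ins for feature encoding schemes (assumed, not the authors' choices).
def aac(seq):
    """Amino acid composition: fraction of each of the 20 standard residues."""
    return [seq.count(a) / len(seq) for a in "ACDEFGHIKLMNPQRSTVWY"]

def hydrophobic_fraction(seq):
    """Single feature: fraction of residues in a hand-picked hydrophobic set."""
    return [sum(seq.count(a) for a in "AILMFVW") / len(seq)]

ENCODERS = {"AAC": aac, "HYD": hydrophobic_fraction}

class CentroidScorer:
    """Hypothetical toy learner: sigmoid of the difference in squared
    distances to the negative and positive class centroids."""
    def __init__(self, scale=1.0):
        self.scale = scale

    def fit(self, X, y):
        pos = [x for x, t in zip(X, y) if t == 1]
        neg = [x for x, t in zip(X, y) if t == 0]
        self.mu_pos = [sum(c) / len(pos) for c in zip(*pos)]
        self.mu_neg = [sum(c) / len(neg) for c in zip(*neg)]
        return self

    def predict_proba(self, x):
        d_pos = sum((a - b) ** 2 for a, b in zip(x, self.mu_pos))
        d_neg = sum((a - b) ** 2 for a, b in zip(x, self.mu_neg))
        return 1.0 / (1.0 + exp(-self.scale * (d_neg - d_pos)))

LEARNERS = {"centroid_s1": lambda: CentroidScorer(1.0),
            "centroid_s10": lambda: CentroidScorer(10.0)}

def out_of_fold_probs(make_model, X, y, k=2):
    """Score each sample with a model trained on the other folds."""
    probs = [0.0] * len(X)
    for fold in range(k):
        train = [i for i in range(len(X)) if i % k != fold]
        model = make_model().fit([X[i] for i in train], [y[i] for i in train])
        for i in range(len(X)):
            if i % k == fold:
                probs[i] = model.predict_proba(X[i])
    return probs

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def mcc(y_true, y_pred):
    """Matthews correlation coefficient from binary labels (0.0 if undefined)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def build_stack(seqs, y, top_m=2):
    # Step 1-2: one probabilistic feature per (encoder, learner) baseline model.
    columns = {}
    for (enc_name, enc), (lrn_name, make) in product(ENCODERS.items(), LEARNERS.items()):
        X = [enc(s) for s in seqs]
        columns[(enc_name, lrn_name)] = out_of_fold_probs(make, X, y)
    # Step 3: keep the top_m features by thresholded accuracy (assumed criterion),
    # then train a meta-learner on the selected probabilistic features.
    def col_acc(name):
        return accuracy(y, [int(p >= 0.5) for p in columns[name]])
    selected = sorted(columns, key=col_acc, reverse=True)[:top_m]
    P = [[columns[name][i] for name in selected] for i in range(len(seqs))]
    meta = CentroidScorer(10.0).fit(P, y)
    return columns, selected, meta

# Made-up toy peptides and labels, for illustration only.
SEQS = ["AILMFVWA", "LLVVAAII", "KKDDEERR", "RRKKDDEE",
        "FFAAVVLL", "MMIIWWAA", "DEKRDEKR", "EEDDKKRR"]
LABELS = [1, 1, 0, 0, 1, 1, 0, 0]

columns, selected, meta = build_stack(SEQS, LABELS)
print(len(columns), "baseline models; selected features:", selected)
```

With 2 encoders and 2 learners the sketch yields 4 baseline models; the same product over 12 encodings and 13 algorithms gives the 156 baseline models reported in the abstract. The `accuracy` and `mcc` helpers correspond to the two metrics quoted for the independent test.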
Pages: 16