Using the Random Forest for Identifying Key Physicochemical Properties of Amino Acids to Discriminate Anticancer and Non-Anticancer Peptides

Cited by: 4
Authors
Deng, Yiting [1 ]
Ma, Shuhan [1 ]
Li, Jiayu [2 ]
Zheng, Bowen [1 ]
Lv, Zhibin [1 ]
Affiliations
[1] Sichuan Univ, Coll Biomed Engn, Chengdu 610065, Peoples R China
[2] Sichuan Univ, Coll Life Sci, Chengdu 610065, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
random forest; anticancer peptide; amino acids index; physicochemical properties; feature selection; PREDICTION; DATABASE; INDEXES;
DOI
10.3390/ijms241310854
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Anticancer peptides (ACPs) represent a promising new therapeutic approach in cancer treatment. They can target cancer cells without affecting healthy tissues or altering normal physiological functions. Machine learning algorithms have increasingly been used to predict peptide sequences with potential ACP activity. This study analyzed four benchmark datasets using a well-established random forest (RF) algorithm. The peptide sequences were converted into 566 physicochemical features extracted from the amino acid index (AAindex) library, which were then subjected to feature selection using four methods: light gradient-boosting machine (LGBM), analysis of variance (ANOVA), chi-squared test (χ²), and mutual information (MI). By presenting and merging the selected features using Venn diagrams, the study identified 19 key amino acid physicochemical properties that can be used to predict the likelihood of a peptide sequence functioning as an ACP. Prediction accuracy was quantified with performance evaluation metrics. This study aims to enhance the efficiency of designing peptide sequences for cancer treatment.
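The two data-handling steps named in the abstract, turning a peptide into AAindex-based features and merging the outputs of several feature selectors, can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not confirm: each feature is taken as the mean of one amino acid index over the residues of the peptide, and the Venn-style merge keeps features chosen by at least a given number of selectors. The function names `encode_peptide` and `merge_selected` and the feature labels `f1` to `f5` are hypothetical. Only one index, the Kyte-Doolittle hydropathy scale (AAindex entry KYTJ820101), is included, rather than all 566.

```python
from collections import Counter

# Kyte-Doolittle hydropathy scale (AAindex entry KYTJ820101).
# Stands in for the 566 AAindex properties mentioned in the abstract.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def encode_peptide(sequence, index_tables):
    """Average each amino acid index over the residues of one peptide.

    Assumption: mean pooling; the abstract does not state the encoding rule.
    Residues missing from a table (e.g. 'X') are skipped.
    """
    features = {}
    for name, table in index_tables.items():
        values = [table[res] for res in sequence.upper() if res in table]
        features[name] = sum(values) / len(values) if values else 0.0
    return features


def merge_selected(selections, min_votes):
    """Keep features chosen by at least `min_votes` selection methods.

    Assumption: a vote threshold approximates reading overlaps off a
    Venn diagram; the study's actual merge rule is not given here.
    """
    votes = Counter(f for chosen in selections.values() for f in set(chosen))
    return sorted(f for f, n in votes.items() if n >= min_votes)


# Hypothetical selector outputs, for illustration only.
example_selections = {
    "LGBM": {"f1", "f2", "f3"},
    "ANOVA": {"f2", "f3"},
    "Chi2": {"f3", "f4"},
    "MI": {"f2", "f3", "f5"},
}

if __name__ == "__main__":
    print(encode_peptide("AAK", {"hydropathy": KYTE_DOOLITTLE}))
    print(merge_selected(example_selections, min_votes=3))
```

In a full pipeline, the resulting feature vectors would be passed to a random forest classifier; that step is omitted here to keep the sketch free of third-party dependencies.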
Pages: 15