PmmNDD: Predicting the Pathogenicity of Missense Mutations in Neurodegenerative Diseases via Ensemble Learning

Cited by: 0
Authors
Li, Xijian [1]
Huang, Ying [2]
Tang, Runxuan [1]
Xiao, Guangcheng [1]
Chen, Xiaochuan [1]
He, Ruilin [1]
Zhang, Zhaolei [3, 4, 5]
Luo, Jiana [1]
Wei, Yanjie [2]
Mao, Yijun [1]
Zhang, Huiling [1]
Affiliations
[1] South China Agr Univ, Coll Math & Informat, Guangzhou 510642, Peoples R China
[2] Chinese Acad Sci, Shenzhen Inst Adv Technol, Shenzhen 518055, Peoples R China
[3] Univ Toronto, Dept Mol Genet, Toronto, ON M1C 1A4, Canada
[4] Univ Toronto, Donnelly Ctr Cellular & Biomol Res, Toronto, ON M1C 1A4, Canada
[5] Univ Toronto, Dept Comp Sci, Toronto, ON M1C 1A4, Canada
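Source
BIOINFORMATICS RESEARCH AND APPLICATIONS, PT III, ISBRA 2024 | 2024 / Vol. 14956
Funding
National Key Research and Development Program of China; US National Science Foundation;
Keywords
missense mutation; neurodegenerative diseases; ensemble learning; mutation interpretation; DELETERIOUSNESS; DATABASE; IMPACT;
DOI
10.1007/978-981-97-5087-0_6
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract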
Accurately distinguishing pathogenic from benign mutations remains a significant challenge in the clinical genetic testing of patients with neurodegenerative diseases (NDDs). In principle, computational methods can facilitate large-scale interpretation of genetic variants in NDDs. However, individual tools often disagree with one another, carry biases, and vary in quality, so their predictions are considered insufficiently reliable. In this study, we developed PmmNDD, an ensemble method for predicting the pathogenicity of missense variants in NDDs. PmmNDD uses the prediction scores of other methods, together with amino acid characteristics, as features, and is built on the categorical boosting (CatBoost) model. The stability and generalization ability of PmmNDD were validated through leave-one-gene-out cross-validation and an independent test. We also demonstrated PmmNDD's superior performance over 20 other methods. Furthermore, we provide pre-computed PmmNDD scores for all possible NDD missense variants to facilitate the identification of pathogenic variants among the many rare variants discovered as sequencing studies expand in scale. In summary, our work suggests that ensemble learning models can provide valuable independent evidence for NDD mutation interpretation that will be widely useful in research and clinical settings.
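
The leave-one-gene-out cross-validation mentioned in the abstract can be illustrated with a minimal sketch of the splitting step alone. It is not the authors' code: the function name `leave_one_gene_out` and the toy gene labels are hypothetical, and no CatBoost model is trained here. The point is only that every variant from the held-out gene goes to the test fold, so a model is never evaluated on a gene it saw during training.

```python
from collections import defaultdict
from typing import Dict, Iterator, List, Sequence, Tuple


def leave_one_gene_out(
    genes: Sequence[str],
) -> Iterator[Tuple[str, List[int], List[int]]]:
    """Yield (held_out_gene, train_indices, test_indices), one fold per gene."""
    # Group variant indices by the gene each variant belongs to.
    by_gene: Dict[str, List[int]] = defaultdict(list)
    for idx, gene in enumerate(genes):
        by_gene[gene].append(idx)

    # Hold out each gene in turn; all remaining variants form the training set.
    for gene in sorted(by_gene):
        test_idx = by_gene[gene]
        held_out = set(test_idx)
        train_idx = [i for i in range(len(genes)) if i not in held_out]
        yield gene, train_idx, test_idx


# Hypothetical toy data: one gene label per variant (not taken from the paper).
variant_genes = ["APP", "APP", "SOD1", "MAPT", "SOD1"]
folds = list(leave_one_gene_out(variant_genes))
for gene, train_idx, test_idx in folds:
    print(gene, train_idx, test_idx)
```

In a full pipeline, each fold's training indices would select the feature rows (predictor scores plus amino acid characteristics) used to fit the classifier, and the test indices would select the rows used to score it.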
Pages: 64-75
Number of pages: 12