Novel gene-specific Bayesian Gaussian mixture model to predict the missense variants pathogenicity of Sanfilippo syndrome

Cited by: 3
Authors
Mohammed, Eman E. A. [1]
Fayez, Alaaeldin G. [2]
Abdelfattah, Nabil M. [3]
Fateen, Ekram [4]
Affiliations
[1] Natl Res Ctr, Human Genet & Genome Res Inst, Med Mol Genet Dept, Giza, Egypt
[2] Natl Res Ctr, Human Genet & Genome Res Inst, Mol Genet & Enzymol Dept, Giza, Egypt
[3] Nassers Inst Res & Treatment Hosp, Cairo, Egypt
[4] Natl Res Ctr, Human Genet & Genome Res Inst, Biochem Genet Dept, Giza, Egypt
Keywords
Machine-learning model; Sanfilippo syndrome; Missense variants; Pathogenicity prediction; DIAGNOSIS;
DOI
10.1038/s41598-024-62352-0
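Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Subject classification codes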
07 ; 0710 ; 09 ;
Abstract
Mucopolysaccharidosis type III (MPS III, Sanfilippo syndrome) is an autosomal recessive lysosomal storage disease caused mainly by missense variants in the NAGLU, GNS, HGSNAT, and SGSH genes. Interpreting the pathogenicity of missense variants remains challenging. We aimed to develop unsupervised clustering-based pathogenicity predictor scores, using features extracted from eight in silico predictors, to predict the impact of novel missense variants in Sanfilippo syndrome. The model was trained on a dataset of 415 NAGLU missense variants of uncertain significance (VUS). The SanfilippoPred tool was evaluated on validation and test datasets consisting of 197 labelled NAGLU missense variants, and its performance was compared with that of individual pathogenicity predictors using receiver operating characteristic (ROC) analysis. Moreover, we tested the SanfilippoPred tool on an additional 427 labelled missense variants to assess its specificity and sensitivity threshold. Applying the trained machine learning (ML) model to the test dataset of labelled NAGLU missense variants showed that SanfilippoPred has an accuracy of 0.93 (95% CI 0.86-0.97), a sensitivity of 0.93, and a specificity of 0.92. SanfilippoPred performed better (AUC = 0.908) than the individual predictors SIFT (AUC = 0.756), PolyPhen-2 (AUC = 0.788), CADD (AUC = 0.568), REVEL (AUC = 0.548), MetaLR (AUC = 0.751), and AlphaMissense (AUC = 0.885). Testing on high-confidence labelled NAGLU variants showed that SanfilippoPred has an 85.7% sensitivity threshold. The poor correlation between the Sanfilippo syndrome phenotype and genotype creates a demand for a new tool to classify its missense variants. This study provides a significant tool for preventing the misinterpretation of missense variants in the Sanfilippo syndrome-relevant genes. Finally, it seems that ML-based pathogenicity predictors and Sanfilippo syndrome-specific prediction tools could be feasible and efficient pathogenicity predictors in the future.
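For readers unfamiliar with the evaluation metrics reported in the abstract, the following is a minimal sketch of how sensitivity, specificity, accuracy, and ROC AUC can be computed for a score-based pathogenicity predictor. It is not the authors' code: the function names, the toy labels and scores, and the 0.5 cut-off are all assumptions made for illustration, and the AUC is computed with the standard pairwise-ranking definition rather than any particular library.

```python
from typing import List, Sequence, Tuple


def confusion_counts(labels: Sequence[int], predictions: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn), where 1 = pathogenic and 0 = benign."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    return tp, fp, tn, fn


def sensitivity_specificity_accuracy(
    labels: Sequence[int], predictions: Sequence[int]
) -> Tuple[float, float, float]:
    """Sensitivity = TP/(TP+FN), specificity = TN/(TN+FP), accuracy = (TP+TN)/N."""
    tp, fp, tn, fn = confusion_counts(labels, predictions)
    return tp / (tp + fn), tn / (tn + fp), (tp + tn) / len(labels)


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random pathogenic variant scores higher
    than a random benign one (ties count as half a win)."""
    pos: List[float] = [s for y, s in zip(labels, scores) if y == 1]
    neg: List[float] = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 4 pathogenic (1) and 4 benign (0) variants with made-up scores.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.4, 0.2, 0.1]
THRESHOLD = 0.5  # assumed cut-off, not taken from the paper
toy_predictions = [1 if s >= THRESHOLD else 0 for s in toy_scores]

sens, spec, acc = sensitivity_specificity_accuracy(toy_labels, toy_predictions)
auc = roc_auc(toy_labels, toy_scores)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f} AUC={auc:.3f}")
```

Sensitivity, specificity, and accuracy depend on the chosen cut-off, whereas AUC summarises ranking quality across all cut-offs, which is why the abstract can compare predictors with different score scales by AUC.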
Pages: 9