Integration of deep learning with Ramachandran plot molecular dynamics simulation for genetic variant classification

Cited by: 5
Authors
Tam, Benjamin [1, 2, 3]
Qin, Zixin [1, 2, 3]
Zhao, Bojin [1, 2, 3]
Wang, San Ming [1, 2, 3]
Lei, Chon Lok [1, 2, 3]
Affiliations
[1] Univ Macau, Fac Hlth Sci, Frontiers Sci Ctr Precis Oncol, Minist Educ, Macau, Peoples R China
[2] Univ Macau, Fac Hlth Sci, Canc Ctr, Macau, Peoples R China
[3] Univ Macau, Inst Translat Med, Fac Hlth Sci, Macau, Peoples R China
Keywords
FUNCTIONAL IMPACT; PROTEIN; PREDICTION; CONSTRAINT; MUTATIONS; EVOLUTION; ACCURACY; REPAIR; DOMAIN; SIFT
DOI
10.1016/j.isci.2023.106122
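Chinese Library Classification (CLC) codes
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Subject classification codes
07; 0710; 09
Abstract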
Functional classification of genetic variants is key to their clinical application in patient care. However, the abundance of variant data generated by next-generation DNA sequencing technologies limits the use of experimental methods for their classification. Here, we developed a protein structure and deep learning (DL)-based system for genetic variant classification, DL-RP-MDS, which comprises two principles: 1) extracting protein structural and thermodynamic information using the Ramachandran plot-molecular dynamics simulation (RP-MDS) method, and 2) combining those data with an unsupervised auto-encoder learning model and a neural network classifier to identify statistically significant patterns of structural change. We observed that DL-RP-MDS provided higher specificity than more than 20 widely used in silico methods in classifying the variants of three DNA damage repair genes: TP53, MLH1, and MSH2. DL-RP-MDS offers a powerful platform for high-throughput genetic variant classification. The software and online application are available at https://genemutation.fhs.um.edu.mo/DL-RP-MDS/.
Pages: 17