Assessing predictions on fitness effects of missense variants in HMBS in CAGI6

Cited by: 0
Authors
Zhang, Jing [1, 2, 3, 4]
Kinch, Lisa [5, 6]
Katsonis, Panagiotis [7]
Lichtarge, Olivier [7]
Jagota, Milind [8]
Song, Yun S. [8, 9]
Sun, Yuanfei [10]
Shen, Yang [10]
Kuru, Nurdan [11]
Dereli, Onur [11]
Adebali, Ogun [11]
Alladin, Muttaqi Ahmad [12]
Pal, Debnath [12]
Capriotti, Emidio [13]
Turina, Maria Paola [13]
Savojardo, Castrense [13]
Martelli, Pier Luigi [13]
Babbi, Giulia [13]
Casadio, Rita [13]
Pucci, Fabrizio [14]
Rooman, Marianne [14]
Cia, Gabriel [14]
Tsishyn, Matsvei [14]
Strokach, Alexey [15]
Hu, Zhiqiang [16, 17]
van Loggerenberg, Warren [18, 19, 20, 21]
Roth, Frederick P. [18, 19, 20, 21]
Radivojac, Predrag [22]
Brenner, Steven E. [16, 17, 23]
Cong, Qian [1, 2, 3, 4]
Grishin, Nick V. [1, 2]
Affiliations
[1] Univ Texas Southwestern Med Ctr, Dept Biophys, Dallas, TX 75390 USA
[2] Univ Texas Southwestern Med Ctr, Dept Biochem, Dallas, TX 75390 USA
[3] Univ Texas Southwestern Med Ctr, Eugene McDermott Ctr Human Growth & Dev, Dallas, TX 75390 USA
[4] Univ Texas Southwestern Med Ctr, Harold C Simmons Comprehens Canc Ctr, Dallas, TX 75390 USA
[5] Univ Texas Southwestern Med Ctr, Howard Hughes Med Inst, Dallas, TX 75390 USA
[6] Univ Texas Southwestern Med Ctr, Dept Mol Biol, Dallas, TX 75390 USA
[7] Baylor Coll Med, Dept Mol & Human Genet, Houston, TX 77030 USA
[8] Univ Calif Berkeley, Comp Sci Div, Berkeley, CA 94720 USA
[9] Univ Calif Berkeley, Dept Stat, Berkeley, CA 94720 USA
[10] Texas A&M Univ, Dept Elect & Comp Engn, College Stn, TX 77843 USA
[11] Sabanci Univ, Fac Engn & Nat Sci, Tuzla, Turkiye
[12] Indian Inst Sci, Dept Computat & Data Sci, Bengaluru 560012, India
[13] Univ Bologna, Dept Pharm & Biotechnol, Via Selmi 3, I-40126 Bologna, Italy
[14] Univ Libre Bruxelles, Computat Biol & Bioinformat, 50 Roosevelt Ave, B-1050 Brussels, Belgium
[15] Univ Toronto, Dept Comp Sci, Toronto, ON M5S 2E4, Canada
[16] Univ Calif Berkeley, Dept Plant & Microbial Biol, Berkeley, CA 94720 USA
[17] Univ Calif Berkeley, Ctr Computat Biol, Berkeley, CA 94720 USA
[18] Univ Pittsburgh, Sch Med, Dept Computat & Syst Biol, Pittsburgh, PA 15213 USA
[19] Univ Toronto, Donnelly Ctr, Toronto, ON M5S 3E1, Canada
[20] Univ Toronto, Dept Mol Genet, Toronto, ON M5S 1A8, Canada
[21] Sinai Hlth, Lunenfeld Tanenbaum Res Inst, Toronto, ON M5G 1X5, Canada
[22] Northeastern Univ, Khoury Coll Comp Sci, Boston, MA 02115 USA
[23] Univ Calif Berkeley, Biophys Grad Grp, Berkeley, CA 94720 USA
Keywords
EVOLUTIONARY ACTION; WEB SERVER; PROTEIN-SEQUENCE; MUTATIONS; STABILITY; TOOL; INHERITANCE; EQUATION;
DOI
10.1007/s00439-024-02680-3
Chinese Library Classification
Q3 [Genetics];
Discipline classification codes
071007 ; 090102 ;
Abstract
This paper presents an evaluation of predictions submitted for the "HMBS" challenge, a component of the sixth round of the Critical Assessment of Genome Interpretation (CAGI6), held in 2021. The challenge required participants to predict the effects of missense variants of the human HMBS gene on yeast growth. The HMBS enzyme, critical for the biosynthesis of heme in eukaryotic cells, is highly conserved among eukaryotes. Despite the application of a variety of algorithms and methods, the performance of predictors was relatively similar, with Kendall's tau correlation coefficients between predictions and experimental scores around 0.3 for a majority of submissions. Notably, the median correlation (>= 0.34) observed among these predictors, especially the top predictions from different groups, was greater than the correlation observed between their predictions and the actual experimental results. Most predictors were moderately successful in distinguishing between deleterious and benign variants, as evidenced by an area under the receiver operating characteristic (ROC) curve (AUC) of approximately 0.7. Compared with the two most recent rounds of CAGI competitions, we noticed that more predictors outperformed the baseline predictor, which is based solely on amino acid frequencies. Nevertheless, the overall accuracy of predictions still falls far short of the positive control, which is derived from experimental scores, indicating the necessity for considerable improvements in the field. The most inaccurately predicted variants in this round were associated with the insertion loop, which is absent in many orthologs, suggesting that the predictors still rely heavily on information from multiple sequence alignments.
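The two headline metrics in the abstract, Kendall's tau between predicted and experimental scores and the ROC AUC for separating deleterious from benign variants, can be illustrated with a minimal Python sketch. This is not the assessors' evaluation code: the scores below are made-up toy values, the 0.5 cutoff used to label a variant as deleterious is a hypothetical choice, and the tau shown is the simple tau-a form without tie correction, which may differ from the variant the paper used.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / total pairs.

    Tied pairs count as neither concordant nor discordant (no tie correction).
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


def roc_auc(damage_scores, is_deleterious):
    """AUC as the probability that a random deleterious variant receives a
    higher damage score than a random benign one (ties count as 0.5)."""
    pos = [s for s, d in zip(damage_scores, is_deleterious) if d]
    neg = [s for s, d in zip(damage_scores, is_deleterious) if not d]
    if not pos or not neg:
        raise ValueError("need at least one deleterious and one benign variant")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): fitness scores where higher means closer to wild type.
experimental = [0.05, 0.20, 0.45, 0.70, 0.90, 1.00]
predicted = [0.10, 0.65, 0.30, 0.60, 0.95, 0.80]

# Hypothetical labeling rule: experimental fitness below 0.5 is "deleterious".
labels = [score < 0.5 for score in experimental]
# Lower predicted fitness means more damaging, so negate to get a damage score.
damage = [-p for p in predicted]

tau = kendall_tau(experimental, predicted)
auc = roc_auc(damage, labels)
print(f"Kendall's tau = {tau:.3f}, AUC = {auc:.3f}")
```

Because tau depends only on the ordering of pairs, it is insensitive to how each predictor scales its scores, which is convenient when comparing submissions that use different scoring ranges. The AUC additionally requires a binary label, so its value depends on the cutoff chosen to separate deleterious from benign variants.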
Pages: 173-189
Number of pages: 17