Extensible benchmarking of methods that identify and quantify polyadenylation sites from RNA-seq data

Cited by: 5
Authors
Bryce-Smith, Sam [1]
Burri, Dominik [2,3]
Gazzara, Matthew R. [4]
Herrmann, Christina J. [2,3]
Danecka, Weronika [5]
Fitzsimmons, Christina M. [6]
Wan, Yuk Kei [7,8]
Zhuang, Farica [9]
Fansler, Mervin M. [10,11]
Fernandez, Jose M. [12,13]
Ferret, Meritxell [12,13]
Gonzalez-Uriarte, Asier [12,13]
Haynes, Samuel [5]
Herdman, Chelsea [14]
Kanitz, Alexander [2,3]
Katsantoni, Maria [2,3]
Marini, Federico [15]
McDonnel, Euan [16]
Nicolet, Ben [17,18]
Poon, Chi-Lam [19]
Rot, Gregor [3,20]
Scharfen, Leonard [21]
Wu, Pin-Jou [22]
Yoon, Yoseop [23]
Barash, Yoseph [4,9]
Zavolan, Mihaela [2,3]
Affiliations
[1] UCL, UCL Queen Sq Inst Neurol, Dept Neuromuscular Dis, UCL Queen Sq Motor Neuron Dis Ctr, London, England
[2] Univ Basel, Biozentrum, Basel, Switzerland
[3] Swiss Inst Bioinformat, Lausanne, Switzerland
[4] Univ Penn, Perelman Sch Med, Dept Genet, Philadelphia, PA 19104 USA
[5] Univ Edinburgh, Inst Cell Biol, Sch Biol Sci, Edinburgh EH9 3FF, Midlothian, Scotland
[6] NCI, Lab Cell Biol, Ctr Canc Res, NIH, Bethesda, MD 20892 USA
[7] Genome Inst Singapore, Buona Vista 138672, Singapore
[8] Natl Univ Singapore, Yong Loo Lin Sch Med, Kent Ridge 119228, Singapore
[9] Univ Penn, Sch Engn, Dept Comp & Informat Sci, Philadelphia, PA 19104 USA
[10] Weill Cornell Grad Studies, Triinst Program Computat Biol & Med, New York, NY 10065 USA
[11] MSKCC, Sloan Kettering Inst, Canc Biol & Genet, New York, NY 10065 USA
[12] Barcelona Supercomp Ctr, Barcelona, Spain
[13] Spanish Natl Bioinformat Inst INB, ELIXIR ES, Madrid, Spain
[14] Univ Utah, Dept Neurobiol, Salt Lake City, UT 84132 USA
[15] Johannes Gutenberg Univ Mainz, Univ Med Ctr, Inst Med Biostat Epidemiol & Informat IMBEI, Mainz, Germany
[16] Univ Leeds, Leeds Inst Data Analyt, Sch Mol & Cellular Biol, Leeds LS2 9NL, England
[17] Univ Amsterdam, Dept Hematopoiesis, Landsteiner Lab, Sanquin Res, Amsterdam UMC, Amsterdam, Netherlands
[18] Oncode Inst, Utrecht, Netherlands
[19] Weill Cornell Med, Grad Sch Med Sci, New York, NY 10065 USA
[20] Univ Zurich, Inst Mol Life Sci, Zurich, Switzerland
[21] Yale Univ, Dept Mol Biophys & Biochem, New Haven, CT 06520 USA
[22] Univ Tubingen, Ctr Plant Mol Biol ZMBP, Tubingen, Germany
[23] Univ Calif Irvine, Sch Med, Dept Microbiol & Mol Genet, Irvine, CA 92617 USA
Funding
EU Horizon 2020; US National Institutes of Health; Swiss National Science Foundation
Keywords
benchmarking; (alternative) polyadenylation; bioinformatics; RNA-seq; community initiative; GENOME-WIDE ANALYSIS; ALTERNATIVE POLYADENYLATION; CLEAVAGE; EXPRESSION; REVEALS; GENES; ENDS
DOI
10.1261/rna.079849.123
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification codes
071010; 081704
Abstract
The tremendous rate with which data is generated and analysis methods emerge makes it increasingly difficult to keep track of their domain of applicability, assumptions, limitations, and consequently, of the efficacy and precision with which they solve specific tasks. Therefore, there is an increasing need for benchmarks, and for the provision of infrastructure for continuous method evaluation. APAeval is an international community effort, organized by the RNA Society in 2021, to benchmark tools for the identification and quantification of the usage of alternative polyadenylation (APA) sites from short-read, bulk RNA-sequencing (RNA-seq) data. Here, we reviewed 17 tools and benchmarked eight on their ability to perform APA identification and quantification, using a comprehensive set of RNA-seq experiments comprising real, synthetic, and matched 3'-end sequencing data. To support continuous benchmarking, we have incorporated the results into the OpenEBench online platform, which allows for continuous extension of the set of methods, metrics, and challenges. We envisage that our analyses will assist researchers in selecting the appropriate tools for their studies, while the containers and reproducible workflows could easily be deployed and extended to evaluate new methods or data sets.
Pages: 1839-1855
Page count: 17