NASA GeneLab RNA-seq consensus pipeline: standardized processing of short-read RNA-seq data

Cited by: 28
Authors
Overbey, Eliah G. [1]
Saravia-Butler, Amanda M. [2, 3]
Zhang, Zhe [4]
Rathi, Komal S. [4]
Fogle, Homer [3, 5]
da Silveira, Willian A. [6, 7]
Barker, Richard J. [8]
Bass, Joseph J. [9, 10]
Beheshti, Afshin [41, 42]
Berrios, Daniel C. [3]
Blaber, Elizabeth A. [11]
Cekanaviciute, Egle [3]
Costa, Helio A. [12, 13]
Davin, Laurence B. [14]
Fisch, Kathleen M. [15]
Gebre, Samrawit G. [3, 41]
Geniza, Matthew [16]
Gilbert, Rachel [17]
Gilroy, Simon [8]
Hardiman, Gary [6, 7, 18]
Herranz, Raul [19]
Kidane, Yared H. [20]
Kruse, Colin P. S. [21]
Lee, Michael D. [22, 23]
Liefeld, Ted [24]
Lewis, Norman G. [14]
McDonald, J. Tyson [25]
Meller, Robert [26]
Mishra, Tejaswini [27]
Perera, Imara Y. [28]
Ray, Shayoni [29]
Reinsch, Sigrid S. [3]
Rosenthal, Sara Brin [15]
Strong, Michael [30]
Szewczyk, Nathaniel J. [31, 32]
Tahimic, Candice G. T. [33]
Taylor, Deanne M. [4, 34]
Vandenbrink, Joshua P. [35]
Villacampa, Alicia [19]
Weging, Silvio [36]
Wolverton, Chris [37]
Wyatt, Sarah E. [38, 39]
Zea, Luis [40]
Costes, Sylvain V. [3]
Galazka, Jonathan M. [3]
Affiliations
[1] Univ Washington, Dept Genome Sci, Seattle, WA 98195 USA
[2] Logyx LLC, Mountain View, CA 94043 USA
[3] NASA, Space Biosci Div, Ames Res Ctr, Moffett Field, CA 94035 USA
[4] Univ Penn, Childrens Hosp Philadelphia, Dept Biomed & Hlth Informat, Philadelphia, PA 19104 USA
[5] NASA, Bionet Corp, Ames Res Ctr, Moffett Field, CA 94035 USA
[6] Queens Univ Belfast, Inst Global Food Secur IGFS, Belfast, Antrim, North Ireland
[7] Queens Univ Belfast, Sch Biol Sci, Belfast, Antrim, North Ireland
[8] Univ Wisconsin, Dept Bot, Madison, WI 53706 USA
[9] Univ Nottingham, Royal Derby Hosp, MRC Versus Arthrit Ctr Musculoskeletal Ageing Res, Derby DE22 3DT, England
[10] Nottingham Biomed Res Ctr, Natl Inst Hlth Res, Derby DE22 3DT, England
[11] Rensselaer Polytech Inst, Dept Biomed Engn, Ctr Biotechnol & Interdisciplinary Studies, Troy, NY 12180 USA
[12] Stanford Univ, Sch Med, Dept Pathol, Stanford, CA 94305 USA
[13] Stanford Univ, Sch Med, Dept Biomed Data Sci, Stanford, CA 94305 USA
[14] Washington State Univ, Inst Biol Chem, Pullman, WA 99164 USA
[15] Univ Calif San Diego, Ctr Computat Biol & Bioinformat, Dept Med, La Jolla, CA 92093 USA
[16] Phylos Biosci, Portland, OR 97214 USA
[17] Univ Space Res Assoc, NASA Postdoctoral Program, NASA, Ames Res Ctr, Moffett Field, CA 94035 USA
[18] Med Univ South Carolina, Charleston, SC 29425 USA
[19] Ctr Invest Biol Margarita Salas CSIC, Ramiro de Maeztu 9, Madrid 28040, Spain
[20] Texas Scottish Rite Hosp Children, Ctr Pediat Bone Biol & Translat Res, 2222 Welborn St, Dallas, TX 75219 USA
[21] Los Alamos Natl Lab, Biosci Div, Los Alamos, NM 87545 USA
[22] NASA, Exobiol Branch, Ames Res Ctr, Mountain View, CA 94035 USA
[23] Blue Marble Space Inst Sci, Seattle, WA 98154 USA
[24] Univ Calif San Diego, Dept Med, San Diego, CA 92093 USA
[25] Georgetown Univ, Dept Radiat Med, Med Ctr, Washington, DC 20007 USA
[26] Morehouse Sch Med, Dept Neurobiol & Pharmacol, Atlanta, GA 30310 USA
[27] Stanford Univ, Dept Genet, Sch Med, Stanford, CA 94305 USA
[28] North Carolina State Univ, Dept Plant & Microbial Biol, Raleigh, NC 27695 USA
[29] NGM Biopharmaceut, San Francisco, CA 94080 USA
[30] Natl Jewish Hlth, Ctr Genes Environm & Hlth, 1400 Jackson St, Denver, CO 80206 USA
[31] Ohio Univ, Ohio Musculoskeletal & Neurol Inst, Athens, OH 43147 USA
[32] Ohio Univ, Dept Biomed Sci, Athens, OH 43147 USA
[33] Univ North Florida, Dept Biol, Jacksonville, FL 32224 USA
[34] Univ Penn, Perelman Sch Med, Dept Pediat, Philadelphia, PA 19104 USA
[35] Louisiana Tech Univ, Dept Biol, Ruston, LA 71272 USA
[36] Martin Luther Univ Halle Wittenberg, Inst Comp Sci, Von Seckendorff Pl 1, D-06120 Halle, Germany
[37] Ohio Wesleyan Univ, Dept Bot & Microbiol, Delaware, OH 43015 USA
[38] Ohio Univ, Dept Environm & Plant Biol, Athens, OH 45701 USA
[39] Ohio Univ, Interdisciplinary Program Mol & Cellular Biol, Athens, OH 45701 USA
[40] Univ Colorado, Aerosp Engn Sci Dept, BioServe Space Technol, Boulder, CO 80303 USA
[41] NASA, KBR, Ames Res Ctr, Moffett Field, CA 94035 USA
[42] Broad Inst MIT & Harvard, Stanley Ctr Psychiat Res, Cambridge, MA 02142 USA
Funding
UK Biotechnology and Biological Sciences Research Council (BBSRC);
Keywords
QUANTIFICATION;
DOI
10.1016/j.isci.2021.102361
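Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code
07; 0710; 09;
Abstract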
With the development of transcriptomic technologies, we are able to quantify precise changes in gene expression profiles from astronauts and other organisms exposed to spaceflight. Members of NASA GeneLab and GeneLab-associated analysis working groups (AWGs) have developed a consensus pipeline for analyzing short-read RNA-sequencing data from spaceflight-associated experiments. The pipeline includes quality control, read trimming, mapping, and gene quantification steps, culminating in the detection of differentially expressed genes. This data analysis pipeline and the results of its execution using data submitted to GeneLab are now all publicly available through the GeneLab database. We present here the full details and rationale for the construction of this pipeline in order to promote transparency, reproducibility, and reusability of pipeline data; to provide a template for data processing of future spaceflight-relevant datasets; and to encourage cross-analysis of data from other databases with the data available in GeneLab.
Pages: 21
Related papers
39 in total
[1]  
Baruzzo G, 2017, NAT METHODS, V14, P135, DOI 10.1038/nmeth.4106
[2]   NASA GeneLab: interfaces for the exploration of space omics data [J].
Berrios, Daniel C. ;
Galazka, Jonathan ;
Grigorev, Kirill ;
Gebre, Samrawit ;
Costes, Sylvain V. .
NUCLEIC ACIDS RESEARCH, 2021, 49 (D1) :D1515-D1522
[3]   Near-optimal probabilistic RNA-seq quantification (vol 34, pg 525, 2016) [J].
Bray, Nicolas L. ;
Pimentel, Harold ;
Melsted, Pall ;
Pachter, Lior .
NATURE BIOTECHNOLOGY, 2016, 34 (08) :888-888
[4]   Nanopore DNA Sequencing and Genome Assembly on the International Space Station [J].
Castro-Wallace, Sarah L. ;
Chiu, Charles Y. ;
John, Kristen K. ;
Stahl, Sarah E. ;
Rubins, Kathleen H. ;
McIntyre, Alexa B. R. ;
Dworkin, Jason P. ;
Lupisella, Mark L. ;
Smith, David J. ;
Botkin, Douglas J. ;
Stephenson, Timothy A. ;
Juul, Sissel ;
Turner, Daniel J. ;
Izquierdo, Fernando ;
Federman, Scot ;
Stryke, Doug ;
Somasekar, Sneha ;
Alexander, Noah ;
Yu, Guixia ;
Mason, Christopher E. ;
Burton, Aaron S. .
SCIENTIFIC REPORTS, 2017, 7
[5]   ToppGene Suite for gene list enrichment analysis and candidate gene prioritization [J].
Chen, Jing ;
Bardes, Eric E. ;
Aronow, Bruce J. ;
Jegga, Anil G. .
NUCLEIC ACIDS RESEARCH, 2009, 37 :W305-W311
[6]   A survey of best practices for RNA-seq data analysis [J].
Conesa, Ana ;
Madrigal, Pedro ;
Tarazona, Sonia ;
Gomez-Cabrero, David ;
Cervera, Alejandra ;
McPherson, Andrew ;
Szczesniak, Michal Wojciech ;
Gaffney, Daniel J. ;
Elo, Laura L. ;
Zhang, Xuegong ;
Mortazavi, Ali .
GENOME BIOLOGY, 2016, 17
[7]   STAR: ultrafast universal RNA-seq aligner [J].
Dobin, Alexander ;
Davis, Carrie A. ;
Schlesinger, Felix ;
Drenkow, Jorg ;
Zaleski, Chris ;
Jha, Sonali ;
Batut, Philippe ;
Chaisson, Mark ;
Gingeras, Thomas R. .
BIOINFORMATICS, 2013, 29 (01) :15-21
[8]   MultiQC: summarize analysis results for multiple tools and samples in a single report [J].
Ewels, Philip ;
Magnusson, Mans ;
Lundin, Sverker ;
Kaller, Max .
BIOINFORMATICS, 2016, 32 (19) :3047-3048
[9]  
Functional Genomics Data Society, 2012, MINSEQE MIN INF HIGH
[10]   Bioconductor: open software development for computational biology and bioinformatics [J].
Gentleman, RC ;
Carey, VJ ;
Bates, DM ;
Bolstad, B ;
Dettling, M ;
Dudoit, S ;
Ellis, B ;
Gautier, L ;
Ge, YC ;
Gentry, J ;
Hornik, K ;
Hothorn, T ;
Huber, W ;
Iacus, S ;
Irizarry, R ;
Leisch, F ;
Li, C ;
Maechler, M ;
Rossini, AJ ;
Sawitzki, G ;
Smith, C ;
Smyth, G ;
Tierney, L ;
Yang, JYH ;
Zhang, JH .
GENOME BIOLOGY, 2004, 5 (10)