A comprehensive workflow for optimizing RNA-seq data analysis

Cited by: 7
Authors
Jiang, Gao [1]
Zheng, Juan-Yu [1]
Ren, Shu-Ning [2]
Yin, Weilun [2]
Xia, Xinli [2]
Li, Yun [1]
Wang, Hou-Ling [2]
Affiliations
[1] Beijing Forestry Univ, Sch Artificial Intelligence, Sch Informat Sci & Technol, Beijing 100083, Peoples R China
[2] Beijing Forestry Univ, Coll Biol Sci & Technol, Natl Engn Res Ctr Tree Breeding & Ecol Restorat, State Key Lab Tree Genet & Breeding, Beijing 100083, Peoples R China
Source
BMC GENOMICS | 2024 / Volume 25 / Issue 01
Funding
Beijing Natural Science Foundation; National Natural Science Foundation of China;
Keywords
RNA-seq data; Differential gene analysis; Software comparison; DIFFERENTIAL EXPRESSION; ALIGNMENT; PROGRAM; HISAT;
DOI
10.1186/s12864-024-10414-y
Chinese Library Classification number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification code
071005; 0836; 090102; 100705;
Abstract
Background: Current RNA-seq analysis software tends to use similar parameters across different species without considering species-specific differences. However, the suitability and accuracy of these tools may vary when analyzing data from different species, such as humans, animals, plants, fungi, and bacteria. For most laboratory researchers lacking a background in information science, constructing an analysis workflow that meets their specific needs from the array of complex analytical tools available poses a significant challenge.
Results: Using RNA-seq data from plants, animals, and fungi, we observed that different analytical tools show some variation in performance when applied to different species. A comprehensive experiment was conducted specifically on plant pathogenic fungal data, with differential gene analysis as the ultimate goal. In this study, 288 pipelines using different tools were applied to five fungal RNA-seq datasets, and their performance was evaluated based on simulation. This led to a relatively universal and superior fungal RNA-seq analysis pipeline that can serve as a reference, along with certain standards for selecting analysis tools. Additionally, we compared various tools for alternative splicing analysis. The results based on simulated data indicated that rMATS remained the optimal choice, although supplementing it with tools such as SpliceWiz could be considered.
Conclusion: The experimental results demonstrate that, compared with default software parameter configurations, the tuned analysis combinations can provide more accurate biological insights. It is beneficial to select analysis software carefully based on the data, rather than choosing tools indiscriminately, in order to achieve high-quality analysis results more efficiently.
Pages: 21