Scalable and unsupervised discovery from raw sequencing reads using SPLASH2

Cited by: 4
Authors
Kokot, Marek [1]
Dehghannasiri, Roozbeh [2,3]
Baharav, Tavor [4,6,7]
Salzman, Julia [2,3,5]
Deorowicz, Sebastian [1]
Affiliations
[1] Silesian Tech Univ, Dept Algorithm & Software, Gliwice, Poland
[2] Stanford Univ, Dept Biomed Data Sci, Stanford, CA 94305 USA
[3] Stanford Univ, Dept Biochem, Stanford, CA 94305 USA
[4] Stanford Univ, Dept Elect Engn, Stanford, CA USA
[5] Stanford Univ, Dept Stat, Stanford, CA 94305 USA
[6] Eric & Wendy Schmidt Ctr, Broad Inst, Cambridge, MA USA
[7] Dana Farber Canc Inst, Dept Data Sci, Boston, MA USA
Funding
U.S. National Science Foundation;
Keywords
CANCER; PTEN;
DOI
10.1038/s41587-024-02381-2
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Subject classification codes
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
We introduce SPLASH2, a fast, scalable implementation of SPLASH based on an efficient k-mer counting approach for regulated sequence variation detection in massive datasets from a wide range of sequencing technologies and biological contexts. We demonstrate biological discovery by SPLASH2 in single-cell RNA sequencing (RNA-seq) data and in bulk RNA-seq data from the Cancer Cell Line Encyclopedia, including unannotated alternative splicing in cancer transcriptomes and sensitive detection of circular RNA. SPLASH2 speeds up analysis of sequence variation in massive datasets.
Pages: 1084-1090
Page count: 21