SMGR: a joint statistical method for integrative analysis of single-cell multi-omics data

Cited by: 34
Authors
Song, Qianqian [1, 2]
Zhu, Xuewei [3 ]
Jin, Lingtao [4 ]
Chen, Minghan [5 ]
Zhang, Wei [1, 2]
Su, Jing [6, 7]
Affiliations
[1] Atrium Hlth Wake Forest Baptist, Wake Forest Baptist Comprehens Canc Ctr, Ctr Canc Genom & Precis Oncol, Winston Salem, NC 27157 USA
[2] Wake Forest Sch Med, Dept Canc Biol, Winston Salem, NC 27157 USA
[3] Wake Forest Sch Med, Dept Internal Med, Sect Mol Med, Winston Salem, NC 27101 USA
[4] UT Hlth San Antonio, Dept Mol Med, San Antonio, TX 78229 USA
[5] Wake Forest Univ, Dept Comp Sci, Winston Salem, NC 27109 USA
[6] Indiana Univ Sch Med, Dept Biostat & Hlth Data Sci, Indianapolis, IN 46202 USA
[7] Wake Forest Sch Med, Sect Gerontol & Geriatr Med, Winston Salem, NC 27157 USA
Keywords
VALIDATION; ACTIVATION; REVEALS; MODELS; NOISE; STATE;
DOI
10.1093/nargab/lqac056
Chinese Library Classification number
Q3 [Genetics];
Discipline classification codes
071007; 090102;
Abstract
Unravelling the regulatory programs from single-cell multi-omics data has long been one of the major challenges in genomics, especially in the emerging single-cell field. Currently there is a huge gap between fast-growing single-cell multi-omics data and effective methods for the integrative analysis of these inherently sparse and heterogeneous data. In this study, we have developed a novel method, the Single-cell Multi-omics Gene co-Regulatory algorithm (SMGR), to detect coherent functional regulatory signals and target genes from joint single-cell RNA-sequencing (scRNA-seq) and single-cell assay for transposase-accessible chromatin using sequencing (scATAC-seq) data obtained from different samples. Given that scRNA-seq and scATAC-seq data can be captured by a zero-inflated negative binomial distribution, we utilize a generalized linear regression model to identify the latent representation of consistently expressed genes and peaks, thus enabling the identification of co-regulatory programs and the elucidation of regulatory mechanisms. Results from both simulation and experimental data demonstrate that SMGR outperforms the existing methods with considerably improved accuracy. To illustrate the biological insights of SMGR, we apply SMGR to mixed-phenotype acute leukemia (MPAL) and identify the MPAL-specific regulatory program with significant peak-gene links, which greatly enhances our understanding of the regulatory mechanisms and potential targets of this complex tumor.
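The zero-inflated negative binomial (ZINB) distribution named in the abstract can be illustrated with a minimal sketch. This is not the SMGR implementation: the function names, the mean/dispersion parameterization (`mu`, `theta`) and the zero-inflation weight `pi` are assumptions chosen for illustration, and the regression step that would link `mu` to covariates is omitted.

```python
import math


def nb_log_pmf(y, mu, theta):
    """Log-probability of count y under a negative binomial with
    mean mu > 0 and dispersion (size) theta > 0. Illustrative only."""
    return (
        math.lgamma(y + theta) - math.lgamma(theta) - math.lgamma(y + 1)
        + theta * math.log(theta / (theta + mu))
        + y * math.log(mu / (theta + mu))
    )


def zinb_pmf(y, mu, theta, pi):
    """Probability of count y under a ZINB: with probability pi the count
    is a structural zero, otherwise it is drawn from the negative binomial."""
    nb = math.exp(nb_log_pmf(y, mu, theta))
    if y == 0:
        return pi + (1.0 - pi) * nb
    return (1.0 - pi) * nb


def zinb_log_likelihood(counts, mu, theta, pi):
    """Log-likelihood of independent counts sharing one (mu, theta, pi)."""
    return sum(math.log(zinb_pmf(y, mu, theta, pi)) for y in counts)
```

For example, `zinb_log_likelihood([0, 0, 3, 1, 0], mu=2.0, theta=1.5, pi=0.3)` scores a sparse count vector of the kind seen in single-cell data. The extra mass at zero is what separates this model from a plain negative binomial.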
Pages: 13