Effective Integration of Single-Cell Multi-Omics Data Using Improved Network-Based Integrative Clustering with Multigraph Regularization

Cited by: 0
Authors
Zhang, Shunqin [1]
Kong, Wei [1]
Wang, Shuaiqun [1]
Wei, Kai [2]
Liu, Kun [1]
Wen, Gen [3]
Yu, Yaling [3, 4]
Affiliations
[1] Shanghai Maritime Univ, Coll Informat Engn, Shanghai, Peoples R China
[2] Univ Chinese Acad Sci, Chinese Acad Sci, Shanghai Inst Nutr & Hlth, CAS Key Lab Computat Biol,Biomed Big Data Ctr, Shanghai, Peoples R China
[3] Shanghai Jiao Tong Univ, Dept Orthoped Surg, Shanghai Peoples Hosp 6, Sch Med, Shanghai, Peoples R China
[4] Shanghai Jiao Tong Univ, Inst Microsurg Extrem, Shanghai Peoples Hosp Affiliated 6, Sch Med, Shanghai, Peoples R China
Funding
Natural Science Foundation of Shanghai;
Keywords
adaptive graph learning; data integration; graph regularization constraints; single-cell multi-omics data;
DOI
10.1089/cmb.2023.0460
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
Integrating different omics data makes it possible to study cellular heterogeneity at the level of transcriptional regulation across multiple molecular layers, which helps identify cell types and reveal the pathogenesis of Alzheimer's disease (AD) from both the gene-expression and the chromatin-accessibility perspective. However, integration algorithms face challenges such as high noise levels, high dimensionality, and computational complexity. In this study, multigraph regularization constraints were introduced into the network-based integrative clustering (NIC) algorithm, yielding MGR-NIC, which removes redundant features and preserves the geometric structure underlying the data when fusing two data types (snRNA-seq and snATAC-seq) from glial cells of AD samples. The effectiveness of MGR-NIC was validated on both simulated datasets and real datasets derived from various tissues. MGR-NIC improves clustering accuracy by selecting features that better represent the structure of the dataset. Its clustering results are strongly consistent with the cluster labels published with the DLPFC dataset, whereas the NIC algorithm often produces overlapping clusters on the same dataset. MGR-NIC was evaluated comprehensively against state-of-the-art algorithms, including NIC, scAI, Multi-Omics Factor Analysis v2, and JSNMF. In this comparison, MGR-NIC was the most stable and reliable method, indicating that it is robust across datasets and yields consistent and accurate results.
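The abstract does not spell out the MGR-NIC objective, so the following is only a minimal sketch of the general building block it names: a graph regularization constraint added to a nonnegative matrix factorization, using the standard multiplicative updates for minimizing ||X - U V^T||_F^2 + lam * Tr(V^T L V). It covers a single data matrix and a single graph; the function names (`knn_graph`, `gnmf`), the binary k-nearest-neighbour weighting, and all parameter values are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def knn_graph(X, k=5):
    """Symmetric binary k-nearest-neighbour affinity matrix over the columns (cells) of X."""
    n = X.shape[1]
    sq = np.sum(X ** 2, axis=0)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X.T @ X)  # squared Euclidean distances
    np.fill_diagonal(d2, np.inf)                       # exclude self-neighbours
    idx = np.argsort(d2, axis=1)[:, :k]                # k closest cells per cell
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    return np.maximum(W, W.T)                          # symmetrise

def objective(X, U, V, L, lam):
    """Reconstruction error plus graph smoothness penalty."""
    return np.linalg.norm(X - U @ V.T, "fro") ** 2 + lam * np.trace(V.T @ L @ V)

def gnmf(X, W, rank, lam=1.0, n_iter=200, seed=0, eps=1e-10):
    """Graph-regularized NMF: X (features x cells) ~ U @ V.T with U, V >= 0."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    U = rng.random((m, rank)) + eps
    V = rng.random((n, rank)) + eps
    D = np.diag(W.sum(axis=1))   # degree matrix
    L = D - W                    # graph Laplacian
    history = [objective(X, U, V, L, lam)]
    for _ in range(n_iter):
        # Multiplicative updates keep both factors nonnegative.
        U *= (X @ V) / (U @ (V.T @ V) + eps)
        V *= (X.T @ U + lam * (W @ V)) / (V @ (U.T @ U) + lam * (D @ V) + eps)
        history.append(objective(X, U, V, L, lam))
    return U, V, history

# Toy usage on random nonnegative data (not real single-cell data).
X_demo = np.random.default_rng(1).random((20, 30))
W_demo = knn_graph(X_demo, k=3)
U_demo, V_demo, hist_demo = gnmf(X_demo, W_demo, rank=4, n_iter=50)
labels_demo = V_demo.argmax(axis=1)  # crude cluster assignment per cell
```

A multigraph variant would plausibly add one such Laplacian penalty per omics layer, but how MGR-NIC combines or weights the graphs is not stated in this record.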
Pages: 601-614
Page count: 14