Chord: an ensemble machine learning algorithm to identify doublets in single-cell RNA sequencing data

Cited by: 6
Authors
Xiong, Ke-Xu [1, 2]
Zhou, Han-Lin [2, 3, 4, 5, 6, 7]
Lin, Cong [2, 5, 6, 8]
Yin, Jian-Hua [2, 5, 6, 8]
Kristiansen, Karsten [2, 7]
Yang, Huan-Ming [2, 9]
Li, Gui-Bo [2, 3, 4, 5, 6, 8]
Affiliations
[1] Univ Chinese Acad Sci, Coll Life Sci, Beijing 100049, Peoples R China
[2] BGI Shenzhen, Shenzhen 518083, Peoples R China
[3] Zhengzhou Univ, BGI Coll, Zhengzhou, Peoples R China
[4] Zhengzhou Univ, Henan Inst Med & Pharmaceut Sci, Zhengzhou, Peoples R China
[5] BGI Shenzhen, BGI Henan, Xinxiang 453000, Henan, Peoples R China
[6] BGI Shenzhen, Shenzhen Key Lab Genom, Guangdong Prov Key Lab Human Dis Genom, Shenzhen 518083, Peoples R China
[7] Univ Copenhagen, Dept Biol, Lab Genom & Mol Biomed, DK-2100 Copenhagen, Denmark
[8] BGI Shenzhen, Shenzhen Key Lab Single Cell Omics, Shenzhen 518083, Peoples R China
[9] James D Watson Inst Genome Sci, Hangzhou 310008, Peoples R China
Keywords
SEQ;
DOI
10.1038/s42003-022-03476-9
Chinese Library Classification
Q [Biological Sciences];
Discipline classification codes
07; 0710; 09;
Abstract
To address the unmet need for help in choosing a suitable doublet detection method, an ensemble machine learning algorithm called Chord was developed; it integrates multiple methods and achieves higher accuracy and stability across different scRNA-seq datasets.

High-throughput single-cell RNA sequencing (scRNA-seq) is widely used, but it produces doublets that disturb downstream analysis. Several computational approaches have been developed to detect doublets. However, most of these methods perform satisfactorily on some datasets but lack stability on others, so no single method can be regarded as a gold standard applicable to all scenarios. Choosing the most appropriate software is therefore a difficult and time-consuming task for researchers. To address these issues, we propose Chord, which implements a machine learning algorithm that integrates multiple doublet detection methods. Chord showed higher accuracy and stability than the individual approaches on different datasets containing real and synthetic data. Moreover, Chord was designed with a modular architecture, which makes it flexible and readily adaptable to the incorporation of new tools. Chord is a general solution to the doublet detection problem.
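The idea of integrating several doublet detectors can be illustrated with a minimal sketch. The abstract does not say which learner Chord uses or which tools it combines, so everything here is an assumption for illustration: the three detectors are hypothetical, their scores are synthetic, and a plain logistic regression stands in for the ensemble model. The sketch only shows the general pattern of training a combiner on per-method scores for cells whose doublet status is known (for example, simulated doublets).

```python
import math
import random


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def simulate_scores(n_cells, doublet_rate, rng):
    """Toy data: per-cell scores in [0, 1] from three hypothetical detectors."""
    # Assumed: how strongly each hypothetical detector reacts to a doublet.
    shifts = (0.4, 0.3, 0.1)
    scores, labels = [], []
    for _ in range(n_cells):
        is_doublet = 1 if rng.random() < doublet_rate else 0
        row = [
            min(1.0, max(0.0, rng.gauss(0.3 + s * is_doublet, 0.15)))
            for s in shifts
        ]
        scores.append(row)
        labels.append(is_doublet)
    return scores, labels


def fit_ensemble(scores, labels, epochs=300, lr=0.5):
    """Fit logistic-regression weights over the per-method scores
    using batch gradient descent on the log loss."""
    n_methods = len(scores[0])
    n = len(scores)
    weights = [0.0] * n_methods
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_methods
        grad_b = 0.0
        for row, y in zip(scores, labels):
            pred = sigmoid(sum(w * v for w, v in zip(weights, row)) + bias)
            err = pred - y
            for j, v in enumerate(row):
                grad_w[j] += err * v
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def ensemble_score(weights, bias, row):
    """Combined doublet probability for one cell's per-method scores."""
    return sigmoid(sum(w * v for w, v in zip(weights, row)) + bias)


rng = random.Random(0)
train_x, train_y = simulate_scores(600, 0.3, rng)
weights, bias = fit_ensemble(train_x, train_y)
print("learned weights:", [round(w, 2) for w in weights])
```

In this toy setting the combiner gives more weight to the detectors whose scores separate doublets from singlets more clearly, which is the sense in which an ensemble can be more stable than any single method. Adding a new tool amounts to adding one more score column.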
Pages: 11
Related papers
50 in total
  • [21] Identification of novel biomarkers for atherosclerosis using single-cell RNA sequencing and machine learning
    Yong, Xi
    Kang, Tengyao
    Li, Mingzhu
    Li, Sixuan
    Yan, Xiang
    Li, Jiuxin
    Lin, Jie
    Lu, Bo
    Zheng, Jianghua
    Xu, Zhengmin
    Yang, Qin
    Li, Jingdong
    MAMMALIAN GENOME, 2025, 36 (01) : 183 - 199
  • [22] Single-Cell RNA Sequencing Data Clustering by Low-Rank Subspace Ensemble Framework
    Wang, ChuanYuan
    Gao, Ying-Lian
    Liu, Jin-Xing
    Kong, Xiong-Zhen
    Zheng, Chun-Hou
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2022, 19 (02) : 1154 - 1164
  • [23] jSRC: a flexible and accurate joint learning algorithm for clustering of single-cell RNA-sequencing data
    Wu, Wenming
    Liu, Zaiyi
    Ma, Xiaoke
    BRIEFINGS IN BIOINFORMATICS, 2021, 22 (05)
  • [24] Complex Analysis of Single-Cell RNA Sequencing Data
    Khozyainova, Anna A.
    Valyaeva, Anna A.
    Arbatsky, Mikhail S.
    Isaev, Sergey V.
    Iamshchikov, Pavel S.
    Volchkov, Egor V.
    Sabirov, Marat S.
    Zainullina, Viktoria R.
    Chechekhin, Vadim I.
    Vorobev, Rostislav S.
    Menyailo, Maxim E.
    Tyurin-Kuzmin, Pyotr A.
    Denisov, Evgeny V.
    BIOCHEMISTRY-MOSCOW, 2023, 88 (02) : 231 - 252
  • [25] Splatter: simulation of single-cell RNA sequencing data
    Zappia, Luke
    Phipson, Belinda
    Oshlack, Alicia
    GENOME BIOLOGY, 2017, 18
  • [26] Complex Analysis of Single-Cell RNA Sequencing Data
    Anna A. Khozyainova
    Anna A. Valyaeva
    Mikhail S. Arbatsky
    Sergey V. Isaev
    Pavel S. Iamshchikov
    Egor V. Volchkov
    Marat S. Sabirov
    Viktoria R. Zainullina
    Vadim I. Chechekhin
    Rostislav S. Vorobev
    Maxim E. Menyailo
    Pyotr A. Tyurin-Kuzmin
    Evgeny V. Denisov
    Biochemistry (Moscow), 2023, 88 : 231 - 252
  • [27] Identify, quantify and characterize cellular communication from single-cell RNA sequencing data with scSeqComm
    Baruzzo, Giacomo
    Cesaro, Giulia
    Di Camillo, Barbara
    BIOINFORMATICS, 2022, 38 (07) : 1920 - 1929
  • [28] Splatter: simulation of single-cell RNA sequencing data
    Luke Zappia
    Belinda Phipson
    Alicia Oshlack
    Genome Biology, 18
  • [29] SINGLE-CELL RNA SEQUENCING TO IDENTIFY PREDICTORS OF MYCOPHENOLATE RESPONSE.
    Collins, K.
    Cheng, Y.
    Eadon, M.
    Dagher, P.
    Gao, H.
    Ferreira, R.
    CLINICAL PHARMACOLOGY & THERAPEUTICS, 2020, 107 : S39 - S40
  • [30] A Fusion Learning Model Based on Deep Learning for Single-Cell RNA Sequencing Data Clustering
    Qiao, Tian-Jing
    Li, Feng
    Yuan, Sha-Sha
    Dai, Ling-Yun
    Wang, Juan
    JOURNAL OF COMPUTATIONAL BIOLOGY, 2024, 31 (06) : 576 - 588