A deconvolution method and its application in analyzing the cellular fractions in acute myeloid leukemia samples

Cited by: 13
Authors
Li, Huamei [1]
Sharma, Amit [2]
Ming, Wenglong [1]
Sun, Xiao [1]
Liu, Hongde [1]
Affiliations
[1] Southeast Univ, Sch Biol Sci & Med Engn, State Key Lab Bioelect, Nanjing 210096, Peoples R China
[2] Univ Hosp Bonn, Dept Ophthalmol, D-53127 Bonn, Germany
Funding
National Natural Science Foundation of China
Keywords
Marker genes; Cellular fractions; Deconvolution; Acute myeloid leukemia; Subgroups; Diagnosis; Prognostic; GENE-EXPRESSION; MUTATIONS
DOI
10.1186/s12864-020-06888-1
Chinese Library Classification (CLC) number
Q81 [Bioengineering (biotechnology)]; Q93 [Microbiology]
Subject classification codes
071005; 0836; 090102; 100705
Abstract
Background: The identification of cell type-specific genes (markers) is an essential step in deconvolving cellular fractions, primarily from the gene expression data of a bulk sample. However, genes identified as significantly changed by pairwise comparisons do not necessarily represent the specificity of gene expression across multiple conditions. In addition, knowledge about identifying gene expression markers across multiple conditions is still scarce.
Results: We developed a hybrid tool, LinDeconSeq, which consists of (1) identifying marker genes using specificity scoring and mutual linearity strategies across any number of cell types, and (2) predicting cellular fractions of bulk samples using weighted robust linear regression with the marker genes identified in the first stage. On multiple publicly available datasets, the marker genes identified by LinDeconSeq demonstrated better accuracy and reproducibility than those identified by MGFM and RNentropy. Among deconvolution methods, LinDeconSeq showed low average deviations (≤ 0.0958) and high average Pearson correlations (≥ 0.8792) between the predicted and actual fractions on the benchmark datasets. Importantly, the cellular fractions predicted by LinDeconSeq appear to be relevant to the diagnosis of acute myeloid leukemia (AML). The fractions of granulocyte-monocyte progenitors (GMP), lymphoid-primed multipotent progenitors (LMPP) and monocytes (MONO) in AML samples were distinct from those in healthy samples and were closely associated with AML. Moreover, the heterogeneity of cellular fractions divided AML patients into two subgroups that differed in both prognosis and mutation patterns. The GMP fraction differed most markedly between the two subgroups, particularly in SubgroupA, which was strongly associated with better AML prognosis and a younger population. Overall, the marker genes identified by LinDeconSeq represent improved features for deconvolution. The data processing strategy applied to the cellular fractions in this study also showed potential for the diagnosis and prognosis of diseases.
Conclusions: We developed a freely available, open-source tool, LinDeconSeq (https://github.com/lihuamei/LinDeconSeq), which includes marker identification and deconvolution procedures. LinDeconSeq is comparable to other current methods in terms of accuracy when applied to benchmark datasets and has broad applicability to the study of clinical outcomes and disease-specific molecular mechanisms.
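The second stage described in the abstract, predicting cellular fractions by robust linear regression on marker genes, can be illustrated with a minimal sketch. This is not the LinDeconSeq implementation: the Huber weighting, the MAD scale estimate, the clipping and renormalisation step, the function names (`huber_weights`, `deconvolve`) and the simulated data are all assumptions chosen for illustration. The "deviation" metric is taken here to be the mean absolute difference between predicted and actual fractions, which the abstract does not define explicitly.

```python
import numpy as np

def huber_weights(residuals, k=1.345):
    """Huber weights: 1 for small residuals, k/|u| for large ones."""
    # Robust scale from the median absolute deviation (MAD).
    scale = np.median(np.abs(residuals - np.median(residuals))) / 0.6745
    if scale < 1e-12:
        return np.ones_like(residuals)
    u = np.abs(residuals) / scale
    return np.where(u <= k, 1.0, k / np.maximum(u, 1e-12))

def deconvolve(signature, bulk, n_iter=50, tol=1e-8):
    """Estimate cell-type fractions for one bulk sample (illustrative only).

    signature: (genes x cell_types) marker expression matrix
    bulk:      (genes,) expression vector of the mixed sample
    """
    weights = np.ones(len(bulk))
    coef = np.zeros(signature.shape[1])
    for _ in range(n_iter):
        # Weighted least squares via row scaling by sqrt(weight).
        sw = np.sqrt(weights)
        new_coef, *_ = np.linalg.lstsq(signature * sw[:, None], bulk * sw, rcond=None)
        converged = np.max(np.abs(new_coef - coef)) < tol
        coef = new_coef
        if converged:
            break
        # Down-weight genes that fit poorly under the current estimate.
        weights = huber_weights(bulk - signature @ coef)
    # Assumption: fractions are non-negative and sum to one.
    coef = np.clip(coef, 0.0, None)
    total = coef.sum()
    return coef / total if total > 0 else coef

# Simulated example (hypothetical data, not from the paper).
rng = np.random.default_rng(0)
signature = rng.uniform(1.0, 100.0, size=(200, 3))   # 200 marker genes, 3 cell types
true_fractions = np.array([0.5, 0.3, 0.2])
bulk = signature @ true_fractions + rng.normal(0.0, 1.0, size=200)
bulk[:5] *= 20.0                                      # a few outlier genes

est = deconvolve(signature, bulk)
mean_abs_deviation = np.mean(np.abs(est - true_fractions))
pearson_r = np.corrcoef(est, true_fractions)[0, 1]
print(est, mean_abs_deviation, pearson_r)
```

The iterative reweighting limits the influence of the outlier genes, which is the general motivation for using a robust rather than an ordinary least-squares fit in this setting.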
Pages: 15