GBMPurity: A machine learning tool for estimating glioblastoma tumor purity from bulk RNA-sequencing data

Authors
Thomas, Morgan P. H. [1, 2]
Ajaib, Shoaib [2]
Tanner, Georgette [2]
Bulpitt, Andrew J. [1]
Stead, Lucy F. [2]
Affiliations
[1] Univ Leeds, Sch Comp Sci, Leeds, England
[2] Univ Leeds, Leeds Inst Med Res St Jamess, Leeds, England
Funding
UK Research and Innovation (UKRI)
Keywords
deconvolution; glioblastoma; transcriptomics; tumor microenvironment; tumor purity; EVOLUTION; SUBTYPES; ATLAS
DOI
10.1093/neuonc/noaf026
Chinese Library Classification
R73 [Oncology]
Discipline classification code
100214
Abstract
Background: Glioblastoma (GBM) presents a significant clinical challenge due to its aggressive nature and extensive heterogeneity. Tumor purity, the proportion of malignant cells within a tumor, is an important covariate for understanding the disease, having direct clinical relevance or obscuring signal of the malignant portion in molecular analyses of bulk samples. However, current methods for estimating tumor purity are nonspecific and technically demanding. Therefore, we aimed to build a reliable and accessible purity estimator for GBM.
Methods: We developed GBMPurity, a deep learning model specifically designed to estimate the purity of IDH-wild type primary GBM from bulk RNA-sequencing (RNA-seq) data. The model was trained using simulated pseudobulk tumors of known purity from labeled single-cell data acquired from the GBmap resource. The performance of GBMPurity was evaluated and compared to several existing tools using independent datasets.
Results: GBMPurity outperformed existing tools, achieving a mean absolute error of 0.15 and a concordance correlation coefficient of 0.88 on validation datasets. We demonstrate the utility of GBMPurity through inference on bulk RNA-seq samples and observe reduced purity of the proneural molecular subtype relative to the classical, attributed to the increased presence of healthy brain cells.
Conclusions: GBMPurity provides a reliable and accessible tool for estimating tumor purity from bulk RNA-seq data, enhancing the interpretation of bulk RNA-seq data and offering valuable insights into GBM biology. To facilitate the use of this model by the wider research community, GBMPurity is available as a web-based tool at: https://gbmdeconvoluter.leeds.ac.uk/.
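Note on the reported metrics: the abstract summarizes agreement between predicted and known purity with a mean absolute error and a concordance correlation coefficient. The sketch below shows how these two quantities are commonly computed, assuming the coefficient is Lin's concordance correlation coefficient with population (1/n) moments; the abstract does not state the exact estimator. The function names and the purity values are hypothetical and are not taken from the paper or from GBMPurity.

```python
from statistics import fmean


def mean_absolute_error(true, pred):
    """Average absolute difference between paired true and predicted values."""
    return fmean(abs(t - p) for t, p in zip(true, pred))


def concordance_correlation(true, pred):
    """Lin's concordance correlation coefficient using population moments.

    Assumption: this is the standard definition, which may differ in detail
    from the estimator used by the authors.
    """
    mean_t, mean_p = fmean(true), fmean(pred)
    var_t = fmean((t - mean_t) ** 2 for t in true)
    var_p = fmean((p - mean_p) ** 2 for p in pred)
    cov = fmean((t - mean_t) * (p - mean_p) for t, p in zip(true, pred))
    # The squared mean difference penalizes systematic over- or under-estimation.
    return 2 * cov / (var_t + var_p + (mean_t - mean_p) ** 2)


# Hypothetical purity values, invented purely for illustration.
true_purity = [0.2, 0.4, 0.6, 0.8]
pred_purity = [0.3, 0.4, 0.5, 0.9]

print("MAE:", mean_absolute_error(true_purity, pred_purity))
print("CCC:", concordance_correlation(true_purity, pred_purity))
```

Unlike Pearson correlation, the concordance coefficient drops when predictions are shifted or rescaled relative to the true values, which is why it is a natural choice when the goal is an absolute purity estimate.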
Pages: 16