GBMPurity: A machine learning tool for estimating glioblastoma tumor purity from bulk RNA-sequencing data

Authors
Thomas, Morgan P. H. [1, 2]
Ajaib, Shoaib [2]
Tanner, Georgette [2]
Bulpitt, Andrew J. [1]
Stead, Lucy F. [2]
Affiliations
[1] University of Leeds, School of Computer Science, Leeds, England
[2] University of Leeds, Leeds Institute of Medical Research at St James's, Leeds, England
Funding
UK Research and Innovation (UKRI)
Keywords
deconvolution; glioblastoma; transcriptomics; tumor microenvironment; tumor purity; EVOLUTION; SUBTYPES; ATLAS
DOI
10.1093/neuonc/noaf026
Chinese Library Classification
R73 [Oncology]
Discipline classification code
100214
Abstract
Background: Glioblastoma (GBM) presents a significant clinical challenge due to its aggressive nature and extensive heterogeneity. Tumor purity, the proportion of malignant cells within a tumor, is an important covariate for understanding the disease, having direct clinical relevance or obscuring signal of the malignant portion in molecular analyses of bulk samples. However, current methods for estimating tumor purity are nonspecific and technically demanding. Therefore, we aimed to build a reliable and accessible purity estimator for GBM.
Methods: We developed GBMPurity, a deep learning model specifically designed to estimate the purity of IDH-wild type primary GBM from bulk RNA-sequencing (RNA-seq) data. The model was trained using simulated pseudobulk tumors of known purity from labeled single-cell data acquired from the GBmap resource. The performance of GBMPurity was evaluated and compared to several existing tools using independent datasets.
Results: GBMPurity outperformed existing tools, achieving a mean absolute error of 0.15 and a concordance correlation coefficient of 0.88 on validation datasets. We demonstrate the utility of GBMPurity through inference on bulk RNA-seq samples and observe reduced purity of the proneural molecular subtype relative to the classical, attributed to the increased presence of healthy brain cells.
Conclusions: GBMPurity provides a reliable and accessible tool for estimating tumor purity from bulk RNA-seq data, enhancing the interpretation of bulk RNA-seq data and offering valuable insights into GBM biology. To facilitate the use of this model by the wider research community, GBMPurity is available as a web-based tool at: https://gbmdeconvoluter.leeds.ac.uk/.
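The abstract names two ideas that a short sketch may make concrete: building pseudobulk samples of known purity from labeled single cells, and scoring purity predictions with mean absolute error and the concordance correlation coefficient. The Python below is an illustrative sketch only, not the authors' implementation. The abstract does not describe the sampling scheme, so drawing cells with replacement and summing raw counts is an assumption, and all function and parameter names (`simulate_pseudobulk`, `concordance_cc`, and so on) are hypothetical.

```python
import numpy as np


def simulate_pseudobulk(counts, is_malignant, n_cells, purity, rng):
    """Sum the counts of sampled cells so that a fraction `purity` is malignant.

    counts: (cells x genes) array; is_malignant: boolean label per cell.
    Assumption: cells are drawn with replacement and raw counts are summed.
    Returns the pseudobulk profile and the purity actually realised.
    """
    n_mal = int(round(purity * n_cells))
    mal_idx = np.flatnonzero(is_malignant)
    non_idx = np.flatnonzero(~is_malignant)
    chosen = np.concatenate([
        rng.choice(mal_idx, size=n_mal, replace=True),
        rng.choice(non_idx, size=n_cells - n_mal, replace=True),
    ])
    return counts[chosen].sum(axis=0), n_mal / n_cells


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted purity."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))


def concordance_cc(y_true, y_pred):
    """Lin's concordance correlation coefficient (population moments)."""
    x = np.asarray(y_true, dtype=float)
    y = np.asarray(y_pred, dtype=float)
    mx, my = x.mean(), y.mean()
    cov = np.mean((x - mx) * (y - my))
    return float(2.0 * cov / (x.var() + y.var() + (mx - my) ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy data: 4 cells x 2 genes, the first two cells labeled malignant.
    toy_counts = np.array([[1, 0], [0, 1], [2, 2], [3, 1]])
    toy_labels = np.array([True, True, False, False])
    bulk, purity = simulate_pseudobulk(toy_counts, toy_labels, 10, 0.7, rng)
    print(bulk, purity)
```

Unlike Pearson correlation, the concordance correlation coefficient penalises systematic offsets between predicted and true purity, which is why it is a natural companion to mean absolute error here.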
Pages: 16