Predicting the Progression from Asymptomatic to Symptomatic Multiple Myeloma and Stage Classification Using Gene Expression Data

Authors
Karathanasis, Nestoras [1]
Spyrou, George M. [1]
Affiliations
[1] Cyprus Inst Neurol & Genet, Bioinformat Dept, 6 Iroon Ave, CY-2371 Nicosia, Cyprus
Keywords
multiple myeloma; cancer; gammopathies; progression; machine learning; monoclonal gammopathy; undetermined significance; long-term; risk; abnormalities; prevalence; prognosis; criteria; models
DOI
10.3390/cancers17020332
Chinese Library Classification
R73 [Oncology]
Discipline classification code
100214
Abstract
Background: Accurate staging of multiple myeloma (MM) is essential for optimizing treatment strategies. Predicting progression from the asymptomatic stage, also referred to as monoclonal gammopathy of undetermined significance (MGUS), to symptomatic MM remains a significant challenge because data are limited. This study aimed to develop machine learning models that improve MM staging accuracy and stratify asymptomatic patients by their risk of progression.
Methods: We used gene expression microarray datasets, combined with various data transformations, to develop machine learning models. For MM staging, models were trained on a single dataset and validated on five independent datasets, with performance evaluated using multiclass area under the curve (AUC) metrics. To predict progression in asymptomatic patients, we employed two approaches: (1) training models on a dataset of asymptomatic patients who either progressed to MM or remained stable, and (2) training models on multiple datasets combining asymptomatic and MM samples and then testing their ability to distinguish asymptomatic patients who remained stable from those who progressed. We performed feature selection and enrichment analyses to identify key signaling pathways underlying disease stages and progression.
Results: The MM staging models performed well: ElasticNet achieved consistent multiclass AUC values of 0.9 across datasets and transformations, indicating robust generalizability. For asymptomatic progression, both modeling approaches yielded similar results, with AUC values exceeding 0.8 across datasets and algorithms (ElasticNet, Boosting, and Support Vector Machines), underscoring their potential for identifying progression risk. Enrichment analyses identified key pathways, including PI3K-Akt, MAPK, Wnt, and mTOR, as central to MM pathogenesis.
Conclusions: To the best of our knowledge, this is the first study to utilize gene expression datasets for classifying patients across different stages of multiple myeloma and to integrate multiple myeloma with asymptomatic cases to predict disease progression, offering a novel methodology with potential clinical applications in patient monitoring and early intervention.
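The abstract reports performance as a multiclass AUC but does not say which formulation was used. As an illustration only, the following minimal Python sketch computes one common variant, the macro-average of one-vs-rest AUCs, from per-class predicted probabilities. The function names, the stage labels `A`, `B`, `C`, and the toy probabilities are all hypothetical and are not taken from the paper.

```python
from itertools import product


def binary_auc(scores, is_positive):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, is_positive) if y]
    neg = [s for s, y in zip(scores, is_positive) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


def multiclass_auc_ovr(prob_rows, labels, classes):
    """Macro-average of one-vs-rest AUCs (an assumed formulation)."""
    aucs = []
    for k, cls in enumerate(classes):
        scores = [row[k] for row in prob_rows]   # predicted P(class k)
        is_cls = [y == cls for y in labels]      # class k versus the rest
        aucs.append(binary_auc(scores, is_cls))
    return sum(aucs) / len(aucs)


# Hypothetical toy data: three stages, two samples each.
classes = ["A", "B", "C"]
labels = ["A", "A", "B", "B", "C", "C"]
probs = [
    [0.7, 0.2, 0.1],
    [0.5, 0.4, 0.1],
    [0.3, 0.5, 0.2],
    [0.6, 0.3, 0.1],  # a stage-B sample scored as likely stage A
    [0.1, 0.3, 0.6],
    [0.2, 0.2, 0.6],
]

if __name__ == "__main__":
    print(round(multiclass_auc_ovr(probs, labels, classes), 3))
```

In a train-on-one, validate-on-many design like the one described, a function of this kind would be applied separately to each held-out dataset, so that a consistent value across datasets reflects generalization rather than fit to the training cohort.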
Pages: 19