Identifying candidate RNA-seq biomarkers for severity discrimination in chemical injuries: A machine learning and molecular dynamics approach

Authors
Arabfard, Masoud [1]
Behmard, Esmaeil [2]
Maghsoudloo, Mazaher [3]
Dadgar, Emad [4]
Parvin, Shahram [1]
Bagheri, Hasan [1]
Affiliations
[1] Baqiyatallah Univ Med Sci, Syst Biol & Poisonings Inst, Chem Injuries Res Ctr, Tehran, Iran
[2] Fasa Univ Med Sci, Sch Adv Technol Med, Fasa, Iran
[3] Southwest Med Univ, Res Ctr Preclin Med, Key Lab Epigenet & Oncol, Luzhou 646000, Sichuan, Peoples R China
[4] Baqiyatallah Univ Med Sci, Students Res Comm, Tehran, Iran
Keywords
Biomarkers; Machine Learning; RNA-Seq; Mustard Gas; Chemical injured; NEUTROPHILS; ALGORITHMS; MECHANISMS; MUSTARD; REPAIR; CXCR1;
DOI
10.1016/j.intimp.2025.114090
Chinese Library Classification
R392 [Medical Immunology]; Q939.91 [Immunology];
Discipline classification code
100102;
Abstract
Introduction: Biomarkers play a crucial role across various fields by providing insight into biological responses to interventions. High-throughput gene expression profiling technologies make it possible to discover data-driven biomarkers from large datasets. This study focuses on identifying biomarkers in gene expression data related to chemical injury by mustard gas, covering a spectrum from healthy individuals to severe injuries.

Materials and methods: The study used RNA-Seq data comprising 52 expression samples for 54,583 gene transcripts. The samples were divided into four classes based on the GOLD classification for chemically injured individuals: Severe (n = 14), Moderate (n = 11), Mild (n = 16), and healthy controls (n = 11). Data were prepared from an Excel file generated in the R programming environment using the MLSeq and devtools packages. Feature selection was performed with a Genetic Algorithm (GA) and Simulated Annealing (SA), and a Random Forest algorithm was used for classification. Ab initio methods were used for computational efficiency and accuracy of results, while molecular dynamics simulation served as a virtual experiment bridging experiment and theory.

Results: A total of 12 models were created, each yielding a list of differentially expressed genes as potential biomarkers. Model performance varied across group comparisons, with GA outperforming SA in most cases. For the Severe vs. Moderate comparison, GA achieved the best performance, with an accuracy of 94.38%, recall of 91.64%, and specificity of 97.10%. SA performed better in specific comparisons involving the Moderate and Mild groups. The candidate biomarkers were evaluated against the gene expression data to assess how their expression changed between groups of chemically injured individuals. Four genes were selected for further investigation based on their expression levels: CXCR1, EIF2B2, RAD51, and RXFP2. The expression levels of these genes were analyzed to determine their differential expression between the groups.

Conclusion: This study was designed as a computational effort to identify diagnostic biomarkers in basic biological systems research. The findings propose a list of discriminative biomarkers capable of distinguishing between groups of chemically injured individuals. The key genes identified could serve as indicators of chemical injury severity, and further investigation is warranted to validate their clinical relevance and their utility in diagnosis and treatment.
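The accuracy, recall, and specificity figures quoted in the Results are standard confusion-matrix quantities for a two-class comparison. The following is a minimal sketch of how such values can be computed, not the authors' pipeline: the function name `binary_metrics` and the toy labels are hypothetical and chosen only for illustration, and the record does not state which class the study treated as positive.

```python
def binary_metrics(y_true, y_pred, positive):
    """Return accuracy, recall (sensitivity), and specificity for a two-class comparison."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    tn = sum(t != positive and p != positive for t, p in pairs)  # true negatives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        "recall": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Hypothetical labels for a Severe vs. Moderate comparison (not data from the study).
y_true = ["Severe"] * 4 + ["Moderate"] * 4
y_pred = ["Severe", "Severe", "Severe", "Moderate",
          "Moderate", "Moderate", "Moderate", "Severe"]

metrics = binary_metrics(y_true, y_pred, positive="Severe")
print(metrics)
```

Here "Severe" is treated as the positive class, so recall measures how many Severe samples are recovered and specificity how many Moderate samples are correctly excluded.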
Pages: 13