Volatile Organic Compounds for the Prediction of Lung Cancer by Using Ensembled Machine Learning Model and Feature Selection

Cited by: 1
Authors
Khanna, Divya [1]
Kumar, Arun [2]
Bhat, Shahid Ahmad [3]
Affiliations
[1] Chitkara Univ, Inst Engn & Technol, Rajpura 140401, Punjab, India
[2] Madhav Inst Sci & Technol, Ctr Artificial Intelligence, Gwalior 474005, Madhya Pradesh, India
[3] LUT Univ, LUT Business Sch, Lappeenranta 53851, Finland
Source
IEEE ACCESS | 2025 / Vol. 13
Keywords
Lung cancer; Cancer; Predictive models; Volatile organic compounds; Machine learning; Lungs; Feature extraction; Analytical models; Support vector machines; Biomarkers; VOCs; lung cancer; biomarkers; machine learning models; ensemble model; ensemble feature selection approach; B-CELL EPITOPES; ALLERGENIC PROTEINS; CLASSIFICATION; BIOMARKERS; LOCATION; DISEASE; SCENT;
DOI
10.1109/ACCESS.2025.3527027
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Lung cancer is a leading cause of death, which makes the advancement of biomarkers critically important. In the present study, volatile organic compounds (VOCs) are considered as biomarkers to predict lung cancer. VOCs from seven different sources (breath, blood, urine, cell line, pleural fluid, cancer tissue, and lung tissue) are targeted to enhance prediction reliability. The study focuses on feature selection and model fusion. Five built-in machine learning models and one proposed ensemble model are used to investigate the different types of VOCs. The idea behind designing an ensemble model is to combine multiple individual models for better performance by using optimal feature sets. This reasoning led to the design of an ensemble model to predict breath VOCs. The AvNNet model outperforms four other models in predicting blood VOCs, cancer tissue VOCs, cell line VOCs, and urine VOCs, achieving accuracies of 70%, 80%, 70%, and 90%, respectively, on the validation dataset. The Blackboost model achieved 90% accuracy on the validation dataset in predicting lung tissue VOCs. The random forest model predicts pleural fluid VOCs with 90% accuracy on the validation dataset. Compared to the individual models, the proposed ensemble model predicts breath VOCs more effectively and achieves 100% accuracy on the validation dataset.
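The abstract describes two ideas, ensemble feature selection and fusion of individual models, without giving their construction. The sketch below illustrates one common generic form of each: features are kept when enough rankers place them in their top k, and class labels are combined by majority vote. It is not the authors' method. The function names, the feature names (`voc_a`, ...), the rankings, and the predictions are all hypothetical placeholders, and an odd number of models is assumed so that two-class votes cannot tie.

```python
from collections import Counter


def select_features_by_vote(rankings, top_k, min_votes):
    """Keep features that appear in the top_k of at least min_votes rankings."""
    votes = Counter()
    for ranking in rankings:
        votes.update(ranking[:top_k])
    return sorted(name for name, count in votes.items() if count >= min_votes)


def majority_vote(predictions):
    """Combine per-model label lists into one label per sample."""
    combined = []
    for sample_labels in zip(*predictions):
        combined.append(Counter(sample_labels).most_common(1)[0][0])
    return combined


# Hypothetical rankings from three feature-selection methods (best first).
rankings = [
    ["voc_a", "voc_b", "voc_c", "voc_d"],
    ["voc_b", "voc_a", "voc_d", "voc_c"],
    ["voc_c", "voc_b", "voc_a", "voc_d"],
]
selected = select_features_by_vote(rankings, top_k=2, min_votes=2)

# Hypothetical labels predicted by three individual models for three samples.
predictions = [
    ["cancer", "control", "cancer"],
    ["cancer", "cancer", "control"],
    ["control", "control", "cancer"],
]
ensemble = majority_vote(predictions)

print(selected)  # ['voc_a', 'voc_b']
print(ensemble)  # ['cancer', 'control', 'cancer']
```

Hard voting is only one fusion choice; averaging predicted class probabilities is another common option, and the record does not say which the paper uses.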
Pages: 9809 - 9820
Number of pages: 12
Related papers
50 in total
  • [1] Exploring Volatile Organic Compounds in Breath for High-Accuracy Prediction of Lung Cancer
    Tsou, Ping-Hsien
    Lin, Zong-Lin
    Pan, Yu-Chiang
    Yang, Hui-Chen
    Chang, Chien-Jen
    Liang, Sheng-Kai
    Wen, Yueh-Feng
    Chang, Chia-Hao
    Chang, Lih-Yu
    Yu, Kai-Lun
    Liu, Chia-Jung
    Keng, Li-Ta
    Lee, Meng-Rui
    Ko, Jen-Chung
    Huang, Guan-Hua
    Li, Yaw-Kuen
    CANCERS, 2021, 13 (06) : 1 - 14
  • [2] Rapid recognition of volatile organic compounds with colorimetric sensor arrays for lung cancer screening
    Zhong, Xianhua
    Li, Dan
    Du, Wei
    Yan, Mengqiu
    Wang, You
    Huo, Danqun
    Hou, Changjun
    ANALYTICAL AND BIOANALYTICAL CHEMISTRY, 2018, 410 (16) : 3671 - 3681
  • [3] Diagnosis by Volatile Organic Compounds in Exhaled Breath from Lung Cancer Patients Using Support Vector Machine Algorithm
    Sakumura, Yuichi
    Koyama, Yutaro
    Tokutake, Hiroaki
    Hida, Toyoaki
    Sato, Kazuo
    Itoh, Toshio
    Akamatsu, Takafumi
    Shin, Woosuck
    SENSORS, 2017, 17 (02)
  • [4] Machine Learning and Feature Selection Methods for EGFR Mutation Status Prediction in Lung Cancer
    Morgado, Joana
    Pereira, Tania
    Silva, Francisco
    Freitas, Claudia
    Negrao, Eduardo
    de Lima, Beatriz Flor
    da Silva, Miguel Correia
    Madureira, Antonio J.
    Ramos, Isabel
    Hespanhol, Venceslau
    Costa, Jose Luis
    Cunha, Antonio
    Oliveira, Helder P.
    APPLIED SCIENCES-BASEL, 2021, 11 (07)
  • [5] Multiomics-Based Feature Extraction and Selection for the Prediction of Lung Cancer Survival
    Jaksik, Roman
    Szumala, Kamila
    Dinh, Khanh Ngoc
    Smieja, Jaroslaw
    INTERNATIONAL JOURNAL OF MOLECULAR SCIENCES, 2024, 25 (07)
  • [6] Analysis of volatile organic compounds in exhaled breath for lung cancer diagnosis using a sensor system
    Chang, Ji-Eun
    Lee, Dae-Sik
    Ban, Sang-Woo
    Oh, Jaeho
    Jung, Moon Youn
    Kim, Seung-Hwan
    Park, SungJoon
    Persaud, Krishna
    Jheon, Sanghoon
    SENSORS AND ACTUATORS B-CHEMICAL, 2018, 255 : 800 - 807
  • [7] Machine Learning and Feature Selection Methods for Disease Classification With Application to Lung Cancer Screening Image Data
    Delzell, Darcie A. P.
    Magnuson, Sara
    Peter, Tabitha
    Smith, Michelle
    Smith, Brian J.
    FRONTIERS IN ONCOLOGY, 2019, 9
  • [8] Quantitative breath analysis of volatile organic compounds of lung cancer patients
    Song, Geng
    Qin, Tao
    Liu, Hu
    Xu, Guo-Bing
    Pan, Yue-Yin
    Xiong, Fu-Xing
    Gu, Kang-Sheng
    Sun, Guo-Ping
    Chen, Zhen-Dong
    LUNG CANCER, 2010, 67 (02) : 227 - 231
  • [9] Volatile organic compounds in human matrices as lung cancer biomarkers: a systematic review
    Janssens, Eline
    van Meerbeeck, Jan P.
    Lamote, Kevin
    CRITICAL REVIEWS IN ONCOLOGY HEMATOLOGY, 2020, 153
  • [10] A study on volatile organic compounds emitted by in-vitro lung cancer cultured cells using gas sensor array and SPME-GCMS
    Thriumani, Reena
    Zakaria, Ammar
    Hashim, Yumi Zuhanis Has-Yun
    Jeffree, Amanina Iymia
    Helmy, Khaled Mohamed
    Kamarudin, Latifah Munirah
    Omar, Mohammad Iqbal
    Shakaff, Ali Yeon Md
    Adom, Abdul Hamid
    Persaud, Krishna C.
    BMC CANCER, 2018, 18