A web-based tool for cancer risk prediction for middle-aged and elderly adults using machine learning algorithms and self-reported questions

Cited by: 0
Authors
Xiao, Xingjian [1]
Yi, Xiaohan [1]
Soe, Nyi Nyi [2, 3]
Latt, Phyu Mon [2, 3]
Lin, Luotao [4]
Chen, Xuefen [1]
Song, Hualing [1]
Sun, Bo [5]
Zhao, Hailei [1]
Xu, Xianglong [1, 2, 3, 6, 7]
Affiliations
[1] Shanghai Univ Tradit Chinese Med, Sch Publ Hlth, Shanghai, Peoples R China
[2] Monash Univ, Fac Med Nursing & Hlth Sci, Sch Translat Med, Clayton, Vic, Australia
[3] Alfred Hlth, Melbourne Sexual Hlth Ctr, Artificial Intelligence & Modelling Epidemiol Prog, Carlton, Vic, Australia
[4] Univ New Mexico, Dept Individual Family & Community Educ, Nutr & Dietet Program, Albuquerque, NM USA
[5] Shanghai Univ Tradit Chinese Med, LongHua Hosp, Endoscopy Ctr, Shanghai, Peoples R China
[6] Shanghai Univ Tradit Chinese Med, Bijie Inst, Bijie, Peoples R China
[7] Bijie Dist Ctr Dis Control & Prevent, Doctoral Workstat, Bijie, Peoples R China
Keywords
Cancer; Pan-cancer; Prediction; Web-based; Risk; Co-management; Co-prevention; Middle-aged; China; Machine learning; Health
DOI
10.1016/j.annepidem.2024.12.003
Chinese Library Classification (CLC) number
R1 [Preventive medicine, hygiene]
Discipline classification code
1004; 120402
Abstract
Background: From a global perspective, China is among the countries with higher cancer incidence and mortality rates. Objective: To create an online cancer risk prediction tool for middle-aged and elderly Chinese adults using machine learning algorithms and self-reported data. Method: Drawing on a cohort of 19,798 participants aged 45 and above from the China Health and Retirement Longitudinal Study (2011-2018), we employed nine machine learning algorithms to construct predictive models for various cancers: Logistic Regression (LR), Adaptive Boosting (AdaBoost), Support Vector Machine (SVM), Random Forest (RF), Gaussian Naive Bayes (GNB), Gradient Boosting Machine (GBM), Light Gradient Boosting Machine (LGBM), eXtreme Gradient Boosting (XGBoost), and K-Nearest Neighbors (KNN). These algorithms are mainly used for classification and regression tasks. Using non-invasive self-reported predictors covering demographic, educational, marital, lifestyle, health history, and other factors, we focused on predicting "Cancer or Malignant Tumour" outcomes. The cancers that can be predicted mainly include lung, breast, cervical, colorectal, gastric, and esophageal cancer, as well as other rare cancers. Results: In the developed tool, MyCancerRisk, the Random Forest algorithm achieved an AUC of 0.75 and an accuracy (ACC) of 0.99 using self-reported variables. Key predictors identified include age, self-rated health, sleep patterns, household heating sources, childhood health status, living conditions, and smoking habits. Conclusion: MyCancerRisk aims to serve as a preventative screening tool, encouraging individuals to undergo testing and adopt healthier behaviours to mitigate the public health impact of cancer. Our study also sheds light on unconventional predictors, such as housing conditions, offering valuable insights for refining cancer prediction models.
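The abstract reports two evaluation metrics, AUC and ACC. The following is a minimal sketch of how these two quantities are commonly computed from predicted risk scores; it is not the authors' code. The function names, the 0.5 threshold, and the toy data are all hypothetical and are not taken from the study. The toy example is chosen only to show that, when the outcome is rare, accuracy and AUC measure different things and can take quite different values.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_score: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of labels predicted correctly after thresholding the scores.

    The 0.5 threshold is an assumption made for this illustration.
    """
    preds = [1 if s >= threshold else 0 for s in y_score]
    return sum(p == t for p, t in zip(preds, y_true)) / len(y_true)


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the study): 2 cases among 100 participants.
y_true = [1, 1] + [0] * 98
y_score = [0.30, 0.05] + [0.10] * 49 + [0.02] * 49

print("ACC:", accuracy(y_true, y_score))
print("AUC:", roc_auc(y_true, y_score))
```

In this toy setting no score reaches the threshold, so every participant is labelled negative and accuracy is still high because cases are rare, while AUC reflects only how well the scores rank cases above non-cases. Whether the same effect applies to the reported results cannot be determined from the abstract alone.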
Pages: 27-35
Number of pages: 9