Predicting second breast cancer among women with primary breast cancer using machine learning algorithms, a population-based observational study

Cited by: 3
Authors
Syleouni, Maria-Eleni [1, 2]
Karavasiloglou, Nena [1, 3]
Manduchi, Laura [4]
Wanner, Miriam [2]
Korol, Dimitri [2]
Ortelli, Laura [5]
Bordoni, Andrea [5]
Rohrmann, Sabine [1, 2, 6]
Affiliations
[1] Univ Zurich, Epidemiol Biostat & Prevent Inst, Div Chron Dis Epidemiol, Zurich, Switzerland
[2] Univ Hosp Zurich, Canc Registry Zurich Zug Schaffhausen & Schwyz, Zurich, Switzerland
[3] European Food Safety Author, Parma, Italy
[4] Swiss Fed Inst Technol, Med Data Sci, Zurich, Switzerland
[5] Ticino Canc Registry, Publ Hlth Div Canton Ticino, Locarno, Switzerland
[6] Univ Zurich, Epidemiol Biostat & Prevent Inst, Hirschengraben 84, CH-8001 Zurich, Switzerland
Keywords
breast cancer; cancer registry; machine learning; prediction; second cancer; RISK-FACTORS; LOCAL RECURRENCE; PROGNOSIS;
DOI
10.1002/ijc.34568
Chinese Library Classification (CLC) number
R73 [Oncology]
Discipline classification code
100214
Abstract
Breast cancer survivors often experience recurrence or a second primary cancer. We developed an automated approach to predict the occurrence of any second breast cancer (SBC) using patient-level data and explored the generalizability of the models with an external validation data source. Breast cancer patients from the cancer registry of Zurich, Zug, Schaffhausen, Schwyz (N = 3213; training dataset) and the cancer registry of Ticino (N = 1073; external validation dataset), diagnosed between 2010 and 2018, were used for model training and validation, respectively. Machine learning (ML) methods, namely a feed-forward neural network (ANN), logistic regression, and extreme gradient boosting (XGB) were employed for classification. The best-performing model was selected based on the receiver operating characteristic (ROC) curve. Key characteristics contributing to a high SBC risk were identified. SBC was diagnosed in 6% of all cases. The most important features for SBC prediction were age at incidence, year of birth, stage, and extent of the pathological primary tumor. The ANN model had the highest area under the ROC curve with 0.78 (95% confidence interval [CI] 0.75-0.82) in the training data and 0.70 (95% CI 0.61-0.79) in the external validation data. Investigating the generalizability of different ML algorithms, we found that the ANN generalized better than the other models on the external validation data. This research is a first step towards the development of an automated tool that could assist clinicians in the identification of women at high risk of developing an SBC and potentially preventing it.
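The abstract reports model performance as an area under the ROC curve with a 95% confidence interval. The following is a minimal sketch of how such a figure can be computed, not the authors' procedure: the abstract does not say how the intervals were derived, so the percentile bootstrap used here is an assumption, and the labels and scores are synthetic placeholders (18 positives out of 300, roughly matching the reported 6% SBC rate). The function names `roc_auc` and `bootstrap_auc_ci` are hypothetical.

```python
import random


def roc_auc(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method)."""
    rng = random.Random(seed)
    n = len(labels)
    aucs = []
    while len(aucs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        # Skip resamples that contain only one class; AUC is undefined there.
        if 0 < sum(ys) < n:
            aucs.append(roc_auc(ys, [scores[i] for i in idx]))
    aucs.sort()
    lower = aucs[int((alpha / 2) * n_boot)]
    upper = aucs[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Synthetic stand-in for a validation set: 18 SBC cases among 300 patients.
rng = random.Random(42)
labels = [1] * 18 + [0] * 282
# Hypothetical model scores: positives are shifted upward on average.
scores = [rng.gauss(1.0, 1.0) if y == 1 else rng.gauss(0.0, 1.0) for y in labels]

auc = roc_auc(labels, scores)
ci_low, ci_high = bootstrap_auc_ci(labels, scores)
print(f"AUC = {auc:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
```

With few positive cases, as in an external validation set with a 6% event rate, the bootstrap interval tends to be wide, which is consistent with the wider interval the abstract reports for the external data.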
Pages: 932-941
Number of pages: 10