Artificial intelligence methods applied to longitudinal data from electronic health records for prediction of cancer: a scoping review

Cited by: 1
Authors
Moglia, Victoria [1]
Johnson, Owen [1]
Cook, Gordon [2, 3]
de Kamps, Marc [1]
Smith, Lesley [2]
Affiliations
[1] Univ Leeds, Sch Comp, Woodhouse Lane, Leeds LS2 9JT, England
[2] Univ Leeds, Leeds Inst Clin Trials Res, Clarendon Way, Leeds LS2 9NL, England
[3] NIHR Leeds Biomed Res Ctr, Chapeltown Rd, Leeds LS7 4SA, England
Funding
UK Research and Innovation (UKRI)
Keywords
Machine learning; Health data; Longitudinal data; Cancer; Time-series; Temporal; Artificial intelligence; Deep learning algorithm; Colorectal cancer; Pancreatic cancer; Risk prediction; Early diagnosis; Time; Models
DOI
10.1186/s12874-025-02473-w
Chinese Library Classification (CLC) number
R19 [Health care organisation and services (health services administration)]
Abstract
Background: Early detection and diagnosis of cancer are vital to improving outcomes for patients. Artificial intelligence (AI) models have shown promise in the early detection and diagnosis of cancer, but there is limited evidence on methods that fully exploit the longitudinal data stored within electronic health records (EHRs). This review summarises the methods currently used to predict cancer from longitudinal data and provides recommendations on how such models should be developed.
Methods: The review was conducted following PRISMA-ScR guidance. Six databases (MEDLINE, EMBASE, Web of Science, IEEE Xplore, PubMed and SCOPUS) were searched for relevant records published before 2 February 2024. Search terms related to the concepts "artificial intelligence", "prediction", "health records", "longitudinal", and "cancer". Data were extracted relating to several areas of the articles: (1) publication details, (2) study characteristics, (3) input data, (4) model characteristics, (5) reproducibility, and (6) quality assessment using the PROBAST tool. Models were evaluated against a framework for terminology relating to reporting of cancer detection and risk prediction models.
Results: Of 653 records screened, 33 were included in the review; 10 predicted risk of cancer, 18 performed either cancer detection or early detection, 4 predicted recurrence, and 1 predicted metastasis. The most common cancers predicted in the studies were colorectal (n = 9) and pancreatic cancer (n = 9). Sixteen studies used feature engineering to represent temporal data, with the most common features representing trends. Eighteen used deep learning models that take a direct sequential input, most commonly recurrent neural networks, but also including convolutional neural networks and transformers. Prediction windows and lead times varied greatly between studies, even for models predicting the same cancer. High risk of bias was found in 90% of the studies. This risk was often introduced by inappropriate study design (n = 26) and inappropriate sample size (n = 26).
Conclusion: This review highlights the breadth of approaches to cancer prediction from longitudinal data. We identify areas where reporting of methods could be improved, particularly regarding where in a patient's trajectory the model is applied. The review shows opportunities for further work, including comparison of these approaches and their applications in other cancers.
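
The abstract notes that trend features were the most common way of representing temporal data through feature engineering. As a minimal sketch of what such a feature might look like, the following Python computes a least-squares slope over repeated measurements for one patient. The function names (`trend_slope`, `trend_features`), the choice of summary features, and the example values are illustrative assumptions and are not taken from the review or from any of the included studies.

```python
from statistics import mean


def trend_slope(days, values):
    """Least-squares slope of values against time (units per day).

    Hypothetical helper: returns 0.0 when fewer than two observations
    exist or when all observations share the same time point.
    """
    if len(days) != len(values):
        raise ValueError("days and values must have the same length")
    if len(days) < 2:
        return 0.0
    d_bar, v_bar = mean(days), mean(values)
    denom = sum((d - d_bar) ** 2 for d in days)
    if denom == 0:
        return 0.0
    numer = sum((d - d_bar) * (v - v_bar) for d, v in zip(days, values))
    return numer / denom


def trend_features(days, values):
    """Summarise one patient's series as a fixed-length feature dict."""
    if not values:
        raise ValueError("at least one observation is required")
    return {
        "last": values[-1],                  # most recent measurement
        "mean": mean(values),                # average level over the window
        "slope": trend_slope(days, values),  # direction and rate of change
        "n_obs": len(values),                # how often the test was taken
    }


# Made-up example: three blood test results taken 100 days apart.
features = trend_features([0, 100, 200], [140.0, 130.0, 120.0])
print(features)
```

A fixed-length summary like this can be passed to a conventional classifier, whereas the sequential deep learning models mentioned in the abstract take the ordered measurements directly.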
Pages: 17