Machine-learning prediction of cancer survival: a retrospective study using electronic administrative records and a cancer registry

Cited by: 67
Authors
Gupta, Sunil [1]
Truyen Tran [1, 2]
Luo, Wei [1]
Dinh Phung [1]
Kennedy, Richard Lee [3]
Broad, Adam [4]
Campbell, David [4]
Kipp, David [4]
Singh, Madhu [4]
Khasraw, Mustafa [3, 4]
Matheson, Leigh [5]
Ashley, David M. [3, 4, 5]
Venkatesh, Svetha [1]
Affiliations
[1] Deakin Univ, Ctr Pattern Recognit & Data Analyt, Geelong, Vic 3217, Australia
[2] Curtin Univ, Dept Comp, Perth, WA 6845, Australia
[3] Deakin Univ, Sch Med, Geelong, Vic 3217, Australia
[4] Barwon Hlth, Andrew Love Canc Ctr, Geelong, Vic, Australia
[5] Barwon Southwest Integrated Canc Serv, Geelong, Vic, Australia
Source
BMJ OPEN | 2014 / Volume 4 / Issue 03
Keywords
Survival; Prediction; Machine Learning; Electronic Medical Record; Cancer; POPULATION-BASED COHORT; BREAST-CANCER; IMPACT; COMORBIDITY; NETWORKS; HEALTH; MODELS;
DOI
10.1136/bmjopen-2013-004007
Chinese Library Classification
R5 [Internal Medicine];
Subject classification code
1002 ; 100201 ;
Abstract
Objectives: Using the prediction of cancer outcome as a model, we have tested the hypothesis that through analysing routinely collected digital data contained in an electronic administrative record (EAR), using machine-learning techniques, we could enhance conventional methods in predicting clinical outcomes.
Setting: A regional cancer centre in Australia.
Participants: Disease-specific data from a purpose-built cancer registry (Evaluation of Cancer Outcomes (ECO)) from 869 patients were used to predict survival at 6, 12 and 24 months. The model was validated with data from a further 94 patients, and results compared to the assessment of five specialist oncologists. Machine-learning prediction using ECO data was compared with that using EAR and a model combining ECO and EAR data.
Primary and secondary outcome measures: Survival prediction accuracy in terms of the area under the receiver operating characteristic curve (AUC).
Results: The ECO model yielded AUCs of 0.87 (95% CI 0.848 to 0.890) at 6 months, 0.796 (95% CI 0.774 to 0.823) at 12 months and 0.764 (95% CI 0.737 to 0.789) at 24 months. Each was slightly better than the performance of the clinician panel. The model performed consistently across a range of cancers, including rare cancers. Combining ECO and EAR data yielded better prediction than the ECO-based model (AUCs ranging from 0.757 to 0.997 for 6 months, AUCs from 0.689 to 0.988 for 12 months and AUCs from 0.713 to 0.973 for 24 months). The best prediction was for genitourinary, head and neck, lung, skin, and upper gastrointestinal tumours.
Conclusions: Machine learning applied to information from a disease-specific (cancer) database and the EAR can be used to predict clinical outcomes. Importantly, the approach described made use of digital data that is already routinely collected but underexploited by clinical health systems.
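The outcome measure named in the abstract, the AUC, can be read as the probability that a randomly chosen patient who died within the horizon received a higher predicted risk than a randomly chosen patient who survived. The following is a minimal sketch of that pairwise definition only. It is not the authors' implementation (the record does not say how the AUCs or their confidence intervals were computed), and the function name `auc`, the labels and the scores are hypothetical toy values, not study data.

```python
def auc(labels, scores):
    """Pairwise AUC: fraction of (positive, negative) pairs in which the
    positive case has the higher score; tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (assumption, not from the study):
# label 1 = died within the horizon, label 0 = survived;
# score = predicted risk of death from some model.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.7, 0.6, 0.4, 0.3, 0.1]
print(auc(labels, scores))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation. The quadratic loop is chosen for readability; a rank-based formulation gives the same value more efficiently on larger cohorts.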
Pages: 7