A review of statistical and machine learning methods for modeling cancer risk using structured clinical data

Cited by: 99
Authors
Richter, Aaron N. [1, 2]
Khoshgoftaar, Taghi M. [1]
Affiliations
[1] Florida Atlantic Univ, Boca Raton, FL 33431 USA
[2] Modernizing Med Inc, Boca Raton, FL 33431 USA
Keywords
Cancer prediction; Cancer recurrence; Cancer relapse; Data mining; Machine learning; Electronic health records; HEPATOCELLULAR-CARCINOMA; PREDICTIVE MODELS; PROGNOSTIC INDEX; LOCAL RECURRENCE; SURGERY; IDENTIFICATION; NOMOGRAM; FUTURE
DOI
10.1016/j.artmed.2018.06.002
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Advancements are constantly being made in oncology, improving prevention and treatment of cancers. To help reduce the impact and deadliness of cancers, they must be detected early. Additionally, there is a risk of cancers recurring after potentially curative treatments are performed. Predictive models can be built using historical patient data to model the characteristics of patients that developed cancer or relapsed. These models can then be deployed into clinical settings to determine if new patients are at high risk for cancer development or recurrence. For large-scale predictive models to be built, structured data must be captured for a wide range of diverse patients. This paper explores current methods for building cancer risk models using structured clinical patient data. Trends in statistical and machine learning techniques are explored, and gaps are identified for future research. The field of cancer risk prediction is a high-impact one, and research must continue for these models to be embraced for clinical decision support of both practitioners and patients.
Pages: 1-14
Page count: 14