A Comparative Study of Data Mining Techniques Applied to Renal-Cell Carcinomas

Cited by: 1
Authors
Duarte, Ana [1]
Peixoto, Hugo [2]
Machado, Jose [2]
Affiliations
[1] Univ Minho, Campus Gualtar, Braga, Portugal
[2] Univ Minho, Ctr Algoritmi, Campus Gualtar, Braga, Portugal
Source
IOT TECHNOLOGIES FOR HEALTH CARE, HEALTHYIOT 2021 | 2022 / Vol. 432
Keywords
Renal-Cell Carcinoma; Data Mining; Survival; Life expectancy; RapidMiner;
DOI
10.1007/978-3-030-99197-5_5
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Although kidney cancer is one of the deadliest diseases and the fight against it has advanced enormously, the best methods for predicting it, namely for Renal-Cell Carcinomas (RCC), are not well known. One way to accelerate current knowledge about RCC is to apply Data Mining techniques to patients' personal and clinical data. It is therefore crucial to understand which techniques are most suitable for extracting knowledge about this disease. In this paper, we followed the CRISP-DM methodology to evaluate different techniques and determine those with the best predictive performance. For this purpose, we used a dataset of 821 records of RCC patients obtained from The Cancer Genome Atlas. The work tests different Data Mining techniques that can be used to predict the 5-year life expectancy of patients with renal cancer and, for patients with a life expectancy of less than 5 years, the number of days to death. The best algorithm for estimating vital status at 5 years was Random Forest, with an accuracy of 87.65% and an AUROC of 0.931. For the prediction of days to death, the best performance was obtained with the k-Nearest Neighbors algorithm, with a root mean square error of 354.6 days. The results suggest that Data Mining techniques can help to understand the influence of various risk factors on the life expectancy of patients with RCC.
Pages: 53-62
Page count: 10