Data set terminology of deep learning in medicine: a historical review and recommendation

Cited by: 6
Authors
Walston, Shannon L. [1]
Seki, Hiroshi [1]
Takita, Hirotaka [1]
Mitsuyama, Yasuhito [1]
Sato, Shingo [2]
Hagiwara, Akifumi [3]
Ito, Rintaro [4]
Hanaoka, Shouhei [5]
Miki, Yukio [1]
Ueda, Daiju [1, 6, 7]
Affiliations
[1] Osaka Metropolitan Univ, Grad Sch Med, Dept Diagnost & Intervent Radiol, Osaka, Japan
[2] Thomas Jefferson Univ, Sidney Kimmel Canc Ctr, Philadelphia, PA USA
[3] Juntendo Univ, Sch Med, Dept Radiol, Tokyo, Japan
[4] Nagoya Univ, Dept Radiol, Nagoya, Japan
[5] Univ Tokyo Hosp, Dept Radiol, Tokyo, Japan
[6] Osaka Metropolitan Univ, Grad Sch Med, Dept Artificial Intelligence, Osaka, Japan
[7] Osaka Metropolitan Univ, Ctr Hlth Sci Innovat, Osaka, Japan
Keywords
Terminology; Artificial intelligence; Deep learning; Data partition; Data splitting; ARTIFICIAL-INTELLIGENCE; VALIDATION; MODEL; PROGNOSIS; TOOL;
DOI
10.1007/s11604-024-01608-1
Chinese Library Classification number
R8 [Special medicine]; R445 [Diagnostic imaging]
Discipline classification code
1002; 100207; 1009
Abstract
Medicine and deep learning-based artificial intelligence (AI) engineering represent two distinct fields each with decades of published history. The current rapid convergence of deep learning and medicine has led to significant advancements, yet it has also introduced ambiguity regarding data set terms common to both fields, potentially leading to miscommunication and methodological discrepancies. This narrative review aims to give historical context for these terms, accentuate the importance of clarity when these terms are used in medical deep learning contexts, and offer solutions to mitigate misunderstandings by readers from either field. Through an examination of historical documents, including articles, writing guidelines, and textbooks, this review traces the divergent evolution of terms for data sets and their impact. Initially, the discordant interpretations of the word 'validation' in medical and AI contexts are explored. We then show that in the medical field as well, terms traditionally used in the deep learning domain are becoming more common, with the data for creating models referred to as the 'training set', the data for tuning of parameters referred to as the 'validation (or tuning) set', and the data for the evaluation of models as the 'test set'. Additionally, the test sets used for model evaluation are classified into internal (random splitting, cross-validation, and leave-one-out) sets and external (temporal and geographic) sets. This review then identifies often misunderstood terms and proposes pragmatic solutions to mitigate terminological confusion in the field of deep learning in medicine. We support the accurate and standardized description of these data sets and the explicit definition of data set splitting terminologies in each publication. These are crucial methods for demonstrating the robustness and generalizability of deep learning applications in medicine. 
This review aspires to enhance the precision of communication, thereby fostering more effective and transparent research methodologies in this interdisciplinary field.
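The data set terms named in the abstract can be illustrated with a minimal Python sketch. It is not taken from the article: the function names, the `site` and `year` record fields, and the 70/15/15 proportions are all assumptions chosen only for illustration.

```python
import random


def split_internal(records, train_frac=0.7, val_frac=0.15, seed=0):
    """Randomly split records into training, validation (tuning), and test sets.

    The fractions are illustrative assumptions, not values from the article.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    return {
        "training": shuffled[:n_train],                    # used to create the model
        "validation": shuffled[n_train:n_train + n_val],   # used to tune parameters
        "test": shuffled[n_train + n_val:],                # internal test set
    }


def k_fold(records, k):
    """Yield (development, held_out) pairs for k-fold cross-validation.

    Setting k == len(records) gives leave-one-out.
    """
    items = list(records)
    for i in range(k):
        held_out = items[i::k]
        development = [r for j, r in enumerate(items) if j % k != i]
        yield development, held_out


def split_external(records, development_sites, cutoff_year):
    """Hold out external test sets by geography (site) and by time (year).

    'site' and 'year' are hypothetical record fields.
    """
    development, geographic_test, temporal_test = [], [], []
    for r in records:
        if r["site"] not in development_sites:
            geographic_test.append(r)   # data from a different institution
        elif r["year"] >= cutoff_year:
            temporal_test.append(r)     # same institution, later period
        else:
            development.append(r)
    return development, geographic_test, temporal_test
```

In this sketch, the external sets are separated first and only the remaining development data is split internally, so no external record can leak into training or tuning.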
Pages: 1100-1109
Number of pages: 10
Related papers
50 in total
  • [1] A review of cancer data fusion methods based on deep learning
    Zhao, Yuxin
    Li, Xiaobo
    Zhou, Changjun
    Peng, Hao
    Zheng, Zhonglong
    Chen, Jun
    Ding, Weiping
    INFORMATION FUSION, 2024, 108
  • [2] Data Science in Economics: Comprehensive Review of Advanced Machine Learning and Deep Learning Methods
    Nosratabadi, Saeed
    Mosavi, Amirhosein
    Duan, Puhong
    Ghamisi, Pedram
    Filip, Ferdinand
    Band, Shahab S.
    Reuter, Uwe
    Gama, Joao
    Gandomi, Amir H.
    MATHEMATICS, 2020, 8 (10) : 1 - 25
  • [3] Deep Learning for Outcome Prediction in Neurosurgery: A Systematic Review of Design, Reporting, and Reproducibility
    Huang, Jonathan
    Shlobin, Nathan A.
    DeCuypere, Michael
    Lam, Sandi K.
    NEUROSURGERY, 2022, 90 (01) : 16 - 38
  • [4] Deep Learning and Big Data in Healthcare: A Double Review for Critical Beginners
    Bote-Curiel, Luis
    Munoz-Romero, Sergio
    Gerrero-Curieses, Alicia
    Rojo-Alvarez, Jose Luis
    APPLIED SCIENCES-BASEL, 2019, 9 (11):
  • [5] Ovarian cancer data analysis using deep learning: A systematic review
    Hira, Muta Tah
    Razzaque, Mohammad A.
    Sarker, Mosharraf
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, 138
  • [6] From machine learning to deep learning: Advances of the recent data-driven paradigm shift in medicine and healthcare
    Chakraborty, Chiranjib
    Bhattacharya, Manojit
    Pal, Soumen
    Lee, Sang-Soo
    CURRENT RESEARCH IN BIOTECHNOLOGY, 2024, 7
  • [7] Deep Learning for Radiotherapy Outcome Prediction Using Dose Data-A Review
    Appelt, A. L.
    Elhaminia, B.
    Gooya, A.
    Gilbert, A.
    Nix, M.
    CLINICAL ONCOLOGY, 2022, 34 (02) : E87 - E96
  • [8] A review of deep learning in dentistry
    Huang, Chenxi
    Wang, Jiaji
    Wang, Shuihua
    Zhang, Yudong
    NEUROCOMPUTING, 2023, 554
  • [9] A Review on Recent Progress in Machine Learning and Deep Learning Methods for Cancer Classification on Gene Expression Data
    Mazlan, Aina Umairah
    Sahabudin, Noor Azida
    Remli, Muhammad Akmal
    Ismail, Nor Syahidatul Nadiah
    Mohamad, Mohd Saberi
    Nies, Hui Wen
    Abd Warif, Nor Bakiah
    PROCESSES, 2021, 9 (08)
  • [10] Deep Reinforcement Learning in Medicine
    Jonsson, Anders
    KIDNEY DISEASES, 2019, 5 (01) : 18 - 22