Data set terminology of deep learning in medicine: a historical review and recommendation

Cited by: 6
Authors
Walston, Shannon L. [1]
Seki, Hiroshi [1]
Takita, Hirotaka [1]
Mitsuyama, Yasuhito [1]
Sato, Shingo [2]
Hagiwara, Akifumi [3]
Ito, Rintaro [4]
Hanaoka, Shouhei [5]
Miki, Yukio [1]
Ueda, Daiju [1, 6, 7]
Affiliations
[1] Osaka Metropolitan Univ, Grad Sch Med, Dept Diagnost & Intervent Radiol, Osaka, Japan
[2] Thomas Jefferson Univ, Sidney Kimmel Canc Ctr, Philadelphia, PA USA
[3] Juntendo Univ, Sch Med, Dept Radiol, Tokyo, Japan
[4] Nagoya Univ, Dept Radiol, Nagoya, Japan
[5] Univ Tokyo Hosp, Dept Radiol, Tokyo, Japan
[6] Osaka Metropolitan Univ, Grad Sch Med, Dept Artificial Intelligence, Osaka, Japan
[7] Osaka Metropolitan Univ, Ctr Hlth Sci Innovat, Osaka, Japan
Keywords
Terminology; Artificial intelligence; Deep learning; Data partition; Data splitting; ARTIFICIAL-INTELLIGENCE; VALIDATION; MODEL; PROGNOSIS; TOOL;
DOI
10.1007/s11604-024-01608-1
Chinese Library Classification number
R8 [Special medicine]; R445 [Diagnostic imaging];
Discipline classification code
1002 ; 100207 ; 1009 ;
Abstract
Medicine and deep learning-based artificial intelligence (AI) engineering represent two distinct fields each with decades of published history. The current rapid convergence of deep learning and medicine has led to significant advancements, yet it has also introduced ambiguity regarding data set terms common to both fields, potentially leading to miscommunication and methodological discrepancies. This narrative review aims to give historical context for these terms, accentuate the importance of clarity when these terms are used in medical deep learning contexts, and offer solutions to mitigate misunderstandings by readers from either field. Through an examination of historical documents, including articles, writing guidelines, and textbooks, this review traces the divergent evolution of terms for data sets and their impact. Initially, the discordant interpretations of the word 'validation' in medical and AI contexts are explored. We then show that in the medical field as well, terms traditionally used in the deep learning domain are becoming more common, with the data for creating models referred to as the 'training set', the data for tuning of parameters referred to as the 'validation (or tuning) set', and the data for the evaluation of models as the 'test set'. Additionally, the test sets used for model evaluation are classified into internal (random splitting, cross-validation, and leave-one-out) sets and external (temporal and geographic) sets. This review then identifies often misunderstood terms and proposes pragmatic solutions to mitigate terminological confusion in the field of deep learning in medicine. We support the accurate and standardized description of these data sets and the explicit definition of data set splitting terminologies in each publication. These are crucial methods for demonstrating the robustness and generalizability of deep learning applications in medicine. 
This review aspires to enhance the precision of communication, thereby fostering more effective and transparent research methodologies in this interdisciplinary field.
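The data set roles named in the abstract can be illustrated with a minimal Python sketch. It is not taken from the article: the `Record` fields, the `split_data` helper, the site labels, the cutoff year, and the 70/15/15 proportions are all hypothetical choices made for illustration, and each record is assumed to represent a distinct patient.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    patient_id: int
    site: str  # hypothetical hospital label
    year: int  # hypothetical acquisition year


def split_data(records, development_site, cutoff_year,
               train_pct=70, tuning_pct=15, seed=0):
    """Partition records into the five roles described in the abstract.

    Assumption: one record per patient, so a record-level random split
    cannot place the same patient in two subsets.
    """
    # External (geographic) test set: every record from another site.
    geographic = [r for r in records if r.site != development_site]
    same_site = [r for r in records if r.site == development_site]
    # External (temporal) test set: same site, acquired at or after the cutoff.
    temporal = [r for r in same_site if r.year >= cutoff_year]
    development = [r for r in same_site if r.year < cutoff_year]

    # Internal sets: one seeded random split of the development data.
    shuffled = development[:]
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * train_pct // 100
    n_tuning = len(shuffled) * tuning_pct // 100
    return {
        "training": shuffled[:n_train],
        "validation (tuning)": shuffled[n_train:n_train + n_tuning],
        "internal test": shuffled[n_train + n_tuning:],
        "external test (temporal)": temporal,
        "external test (geographic)": geographic,
    }


# Hypothetical cohort: 100 patients at site "A" (2015-2019), 30 at site "B".
records = [Record(i, "A", 2015 + i % 5) for i in range(100)]
records += [Record(100 + i, "B", 2018) for i in range(30)]

splits = split_data(records, development_site="A", cutoff_year=2019)
for name, subset in splits.items():
    print(name, len(subset))
```

Labelling each subset by its role, and stating in the publication how each was derived, is the kind of explicit definition the abstract recommends. Cross-validation and leave-one-out, which the abstract also lists as internal approaches, are not shown here.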
Pages: 1100-1109
Page count: 10