Data set terminology of deep learning in medicine: a historical review and recommendation

Cited by: 6
Authors
Walston, Shannon L. [1]
Seki, Hiroshi [1]
Takita, Hirotaka [1]
Mitsuyama, Yasuhito [1]
Sato, Shingo [2]
Hagiwara, Akifumi [3]
Ito, Rintaro [4]
Hanaoka, Shouhei [5]
Miki, Yukio [1]
Ueda, Daiju [1, 6, 7]
Affiliations
[1] Osaka Metropolitan Univ, Grad Sch Med, Dept Diagnost & Intervent Radiol, Osaka, Japan
[2] Thomas Jefferson Univ, Sidney Kimmel Canc Ctr, Philadelphia, PA USA
[3] Juntendo Univ, Sch Med, Dept Radiol, Tokyo, Japan
[4] Nagoya Univ, Dept Radiol, Nagoya, Japan
[5] Univ Tokyo Hosp, Dept Radiol, Tokyo, Japan
[6] Osaka Metropolitan Univ, Grad Sch Med, Dept Artificial Intelligence, Osaka, Japan
[7] Osaka Metropolitan Univ, Ctr Hlth Sci Innovat, Osaka, Japan
Keywords
Terminology; Artificial intelligence; Deep learning; Data partition; Data splitting; ARTIFICIAL-INTELLIGENCE; VALIDATION; MODEL; PROGNOSIS; TOOL;
DOI
10.1007/s11604-024-01608-1
Chinese Library Classification number
R8 [Special Medicine]; R445 [Diagnostic Imaging];
Discipline classification code
1002 ; 100207 ; 1009 ;
Abstract
Medicine and deep learning-based artificial intelligence (AI) engineering represent two distinct fields each with decades of published history. The current rapid convergence of deep learning and medicine has led to significant advancements, yet it has also introduced ambiguity regarding data set terms common to both fields, potentially leading to miscommunication and methodological discrepancies. This narrative review aims to give historical context for these terms, accentuate the importance of clarity when these terms are used in medical deep learning contexts, and offer solutions to mitigate misunderstandings by readers from either field. Through an examination of historical documents, including articles, writing guidelines, and textbooks, this review traces the divergent evolution of terms for data sets and their impact. Initially, the discordant interpretations of the word 'validation' in medical and AI contexts are explored. We then show that in the medical field as well, terms traditionally used in the deep learning domain are becoming more common, with the data for creating models referred to as the 'training set', the data for tuning of parameters referred to as the 'validation (or tuning) set', and the data for the evaluation of models as the 'test set'. Additionally, the test sets used for model evaluation are classified into internal (random splitting, cross-validation, and leave-one-out) sets and external (temporal and geographic) sets. This review then identifies often misunderstood terms and proposes pragmatic solutions to mitigate terminological confusion in the field of deep learning in medicine. We support the accurate and standardized description of these data sets and the explicit definition of data set splitting terminologies in each publication. These are crucial methods for demonstrating the robustness and generalizability of deep learning applications in medicine. 
This review aspires to enhance the precision of communication, thereby fostering more effective and transparent research methodologies in this interdisciplinary field.
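The data set terms named in the abstract can be illustrated with a minimal Python sketch. It is not taken from the review itself: the record fields (`id`, `year`), the function names, the split fractions, and the cutoff year are all hypothetical choices made for illustration. The sketch first holds out a temporal external test set, then randomly partitions the remaining development data into a training set, a validation (tuning) set, and an internal test set.

```python
import random


def split_temporal(records, cutoff_year):
    """Hold out records from cutoff_year onward as a temporal external test set."""
    development = [r for r in records if r["year"] < cutoff_year]
    external_test = [r for r in records if r["year"] >= cutoff_year]
    return development, external_test


def split_internal(records, train_frac=0.7, val_frac=0.15, seed=0):
    """Randomly partition records into training, validation (tuning),
    and internal test sets. The fractions are illustrative assumptions."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    return {
        "training": shuffled[:n_train],                    # used to fit the model
        "validation": shuffled[n_train:n_train + n_val],   # used to tune it
        "internal_test": shuffled[n_train + n_val:],       # used only to evaluate
    }


# Hypothetical records: 100 cases acquired between 2015 and 2022.
records = [{"id": i, "year": 2015 + i % 8} for i in range(100)]

development, temporal_test = split_temporal(records, cutoff_year=2021)
parts = split_internal(development)

for name, part in parts.items():
    print(name, len(part))
print("temporal_external_test", len(temporal_test))
```

A geographic external test set could be formed the same way by filtering on an institution field instead of the year. In either case, naming each partition explicitly, as the dictionary keys do here, is the kind of per-publication definition the abstract recommends.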
Pages: 1100-1109
Page count: 10