AAPM task group report 273: Recommendations on best practices for AI and machine learning for computer-aided diagnosis in medical imaging

Cited by: 51

Authors:

Hadjiiski, Lubomir ^{[1]}

Cha, Kenny ^{[2]}

Chan, Heang-Ping ^{[1]}

Drukker, Karen ^{[3]}

Morra, Lia ^{[4]}

Nappi, Janne J. ^{[5,6]}

Sahiner, Berkman ^{[2]}

Yoshida, Hiroyuki ^{[5,6]}

Chen, Quan ^{[7]}

Deserno, Thomas M. ^{[8,9]}

Greenspan, Hayit ^{[10,11]}

Huisman, Henkjan ^{[12]}

Huo, Zhimin ^{[13]}

Mazurchuk, Richard ^{[14]}

Petrick, Nicholas ^{[2]}

Regge, Daniele ^{[15,16]}

Samala, Ravi ^{[2]}

Summers, Ronald M. ^{[17]}

Suzuki, Kenji ^{[18]}

Tourassi, Georgia ^{[19]}

Vergara, Daniel ^{[20]}

Armato, Samuel G., III ^{[3]}

Affiliations:

[1] Univ Michigan, Dept Radiol, 1500 E Med Ctr Dr,MIB C476, Ann Arbor, MI 48109 USA

[2] US FDA, Silver Spring, MD USA

[3] Univ Chicago, Dept Radiol, Chicago, IL 60637 USA

[4] Politecn Torino, Dept Control & Comp Engn, Turin, Italy

[5] Massachusetts Gen Hosp, Dept Radiol, 3D Imaging Res, Boston, MA 02114 USA

[6] Harvard Med Sch, Boston, MA 02115 USA

[7] Univ Kentucky, Dept Radiat Med, Lexington, KY USA

[8] TU Braunschweig, Peter L Reichertz Inst Med Informat, Braunschweig, Germany

[9] Hannover Med Sch, Braunschweig, Germany

[10] Tel Aviv Univ, Fac Engn, Dept Biomed Engn, Tel Aviv, Israel

[11] Tel Aviv Univ, Icahn Sch Med, Dept Radiol, New York, NY USA

[12] Radboud Univ Nijmegen, Med Ctr, Radboud Inst Hlth Sci, Nijmegen, Netherlands

[13] Tencent Amer, Palo Alto, CA USA

[14] NCI, Div Canc Prevent, NIH, Bethesda, MD 20892 USA

[15] FPO IRCCS, Radiol Unit, Candiolo Canc Inst, Candiolo, Italy

[16] Univ Turin, Dept Surg Sci, Turin, Italy

[17] NIH, Radiol & Imaging Sci, Clin Ctr, Bldg 10, Bethesda, MD 20892 USA

[18] Tokyo Inst Technol, Inst Innovat Res, Tokyo, Japan

[19] Oak Ridge Natl Lab, Oak Ridge, TN USA

[20] Yale New Haven Hosp, Dept Radiol, New Haven, CT USA

Source:

MEDICAL PHYSICS | 2023 / Vol. 50 / Issue 02

Funding:

National Institutes of Health (NIH), USA;

Keywords:

AI; best practices; CAD; decision support systems; image analysis; machine learning; medical imaging; model development; reference standards; BREAST-CANCER DETECTION; CONVOLUTIONAL NEURAL-NETWORKS; SCREENING MAMMOGRAPHY; ARTIFICIAL-INTELLIGENCE; MASS-DETECTION; CLASSIFIER PERFORMANCE; RADIOLOGISTS DETECTION; OBSERVER-PERFORMANCE; VISUAL EXPLANATIONS; DATA AUGMENTATION;

DOI:

10.1002/mp.16188

Chinese Library Classification (CLC) number:

R8 [Special Medicine]; R445 [Diagnostic Imaging];

Discipline classification code:

1002 ; 100207 ; 1009 ;

Abstract:

Rapid advances in artificial intelligence (AI) and machine learning, and specifically in deep learning (DL) techniques, have enabled broad application of these methods in health care. The promise of the DL approach has spurred further interest in computer-aided diagnosis (CAD) development and applications using both "traditional" machine learning methods and newer DL-based methods. We use the term CAD-AI to refer to this expanded clinical decision support environment that uses traditional and DL-based AI methods. Numerous studies have been published to date on the development of machine learning tools for computer-aided, or AI-assisted, clinical tasks. However, most of these machine learning models are not ready for clinical deployment. It is of paramount importance to ensure that a clinical decision support tool undergoes proper training and rigorous validation of its generalizability and robustness before adoption for patient care in the clinic. To address these important issues, the American Association of Physicists in Medicine (AAPM) Computer-Aided Image Analysis Subcommittee (CADSC) is charged, in part, to develop recommendations on practices and standards for the development and performance assessment of computer-aided decision support systems. The committee has previously published two opinion papers on the evaluation of CAD systems and issues associated with user training and quality assurance of these systems in the clinic. With machine learning techniques continuing to evolve and CAD applications expanding to new stages of the patient care process, the current task group report considers the broader issues common to the development of most, if not all, CAD-AI applications and their translation from the bench to the clinic. The goal is to bring attention to the proper training and validation of machine learning algorithms that may improve their generalizability and reliability and accelerate the adoption of CAD-AI systems for clinical decision support.


Pages: E1 / E24

Page count: 24

References: 192 in total

