Self-supervised learning for chest computed tomography: training strategies and effect on downstream applications

Times Cited: 0
Authors
Tariq, Amara [1]
Ramasamy, Gokul [1]
Patel, Bhavik [1,2,3]
Banerjee, Imon [1,2,3,4]
Affiliations
[1] Mayo Clin Arizona, Arizona Adv AI Hub, Phoenix, AZ 85054 USA
[2] Mayo Clin Arizona, Dept Radiol, Phoenix, AZ USA
[3] Arizona State Univ, Sch Comp & Augmented Intelligence, Tempe, AZ USA
[4] Mayo Clin, Dept Artificial Intelligence & Informat, Scottsdale, AZ USA
Keywords
biomedical imaging; computed tomography; image processing; self-supervised learning
DOI
10.1117/1.JMI.11.6.064003
Chinese Library Classification (CLC)
R8 [Special Medicine]; R445 [Diagnostic Imaging]
Subject Classification Codes
1002; 100207; 1009
Abstract
Purpose: Self-supervised pre-training can reduce the amount of labeled training data needed by pre-learning fundamental visual characteristics of medical imaging data. We investigate several self-supervised training strategies for chest computed tomography (CT) exams and their effects on downstream applications.
Approach: We benchmark five well-known self-supervision strategies (masked image region prediction, next slice prediction, rotation prediction, flip prediction, and denoising) on 15 M chest CT slices collected from four sites of the Mayo Clinic enterprise, United States. These models were evaluated on two downstream tasks using public datasets: pulmonary embolism (PE) detection (classification) and lung nodule segmentation. Image embeddings generated by these models were also evaluated for prediction of patient age, race, and gender to study inherent biases in the models' understanding of chest CT exams.
Results: The use of pre-training weights, especially masked region prediction-based weights, improved performance and reduced the computational effort needed for downstream tasks compared with task-specific state-of-the-art (SOTA) models. Performance improvement for PE detection was observed for training dataset sizes as large as ~380 K, with a maximum gain of 5% over SOTA. The segmentation model initialized with pre-training weights learned twice as fast as the randomly initialized model. While gender and age predictors built using self-supervised training weights showed no performance improvement over randomly initialized predictors, the race predictor gained 10% in performance when using self-supervised training weights.
Conclusion: We released the self-supervised models and weights under an open-source academic license. These models can be fine-tuned with limited task-specific annotated data for a variety of downstream imaging tasks, thus accelerating research in biomedical imaging informatics.
Pages: 19
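Of the five pretext tasks benchmarked in the abstract, masked image region prediction yielded the strongest downstream gains. Below is a minimal sketch of that pretext task, assuming a PyTorch setup; the toy encoder-decoder, patch size, mask ratio, and stand-in data loader are illustrative assumptions and do not reflect the released models' actual architecture or training configuration.

```python
# Minimal sketch of masked-image-region-prediction pre-training.
# All hyperparameters and modules here are illustrative assumptions.
import torch
import torch.nn as nn

def mask_random_patches(x, patch=32, mask_ratio=0.4):
    """Zero out a random subset of square patches; return masked input and mask."""
    b, _, h, w = x.shape
    mask = torch.ones_like(x)  # 1 = visible, 0 = masked
    ph, pw = h // patch, w // patch
    n_masked = int(mask_ratio * ph * pw)
    for i in range(b):
        for j in torch.randperm(ph * pw)[:n_masked].tolist():
            r, c = (j // pw) * patch, (j % pw) * patch
            mask[i, :, r:r + patch, c:c + patch] = 0.0
    return x * mask, mask

# Toy convolutional encoder-decoder standing in for the pre-trained backbone.
encoder = nn.Sequential(
    nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
)
decoder = nn.Sequential(
    nn.ConvTranspose2d(32, 16, 2, stride=2), nn.ReLU(),
    nn.ConvTranspose2d(16, 1, 2, stride=2),
)
model = nn.Sequential(encoder, decoder)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.MSELoss()

# Stand-in loader: in practice this would stream normalized chest CT slices.
loader = [torch.randn(4, 1, 256, 256) for _ in range(2)]

for ct_batch in loader:
    masked, mask = mask_random_patches(ct_batch)
    recon = model(masked)
    # Reconstruction loss is computed only over the masked regions.
    loss = loss_fn(recon * (1 - mask), ct_batch * (1 - mask))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Downstream reuse then amounts to initializing a task-specific head on top of the pre-trained encoder and fine-tuning with limited annotated data, as the conclusion describes. Continuing the sketch above for the PE-detection task (the checkpoint path and head dimensions are hypothetical placeholders):

```python
# Hypothetical downstream fine-tuning setup for PE detection.
encoder.load_state_dict(torch.load("ssl_encoder.pt"))  # illustrative checkpoint name
pe_classifier = nn.Sequential(
    encoder,
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(32, 1),  # binary pulmonary-embolism logit per slice
)
```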