Feature Robustness and Sex Differences in Medical Imaging: A Case Study in MRI-Based Alzheimer's Disease Detection

Cited by: 14
Authors
Petersen, Eike [1]
Feragen, Aasa [1]
Zemsch, Maria Luise Da Costa [1]
Henriksen, Anders [1]
Christensen, Oskar Eiler Wiese [1]
Ganz, Melanie [2,3]
Affiliations
[1] Tech Univ Denmark DTU Compute, Lyngby, Denmark
[2] Univ Copenhagen, Dept Comp Sci, Copenhagen, Denmark
[3] Rigshosp, Neurobiol Res Unit, Copenhagen, Denmark
Source
MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION, MICCAI 2022, PT I | 2022 / Vol. 13431
Funding
National Institutes of Health (USA); Canadian Institutes of Health Research
Keywords
Deep learning; MRI; Alzheimer's disease; Robustness; Logistic regression; Gender
DOI
10.1007/978-3-031-16431-6_9
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Convolutional neural networks have enabled significant improvements in medical image-based diagnosis. It is, however, increasingly clear that these models are susceptible to performance degradation when facing spurious correlations and dataset shift, leading, e.g., to underperformance on underrepresented patient groups. In this paper, we compare two classification schemes on the ADNI MRI dataset: a simple logistic regression model using manually selected volumetric features, and a convolutional neural network trained on 3D MRI data. We assess the robustness of the trained models in the face of varying dataset splits, training set sex composition, and stage of disease. In contrast to earlier work in other imaging modalities, we do not observe a clear pattern of improved model performance for the majority group in the training dataset. Instead, while logistic regression is fully robust to dataset composition, we find that CNN performance is generally improved for both male and female subjects when including more female subjects in the training dataset. We hypothesize that this might be due to inherent differences in the pathology of the two sexes. Moreover, in our analysis, the logistic regression model outperforms the 3D CNN, emphasizing the utility of manual feature specification based on prior knowledge, and the need for more robust automatic feature selection.
Pages: 88-98
Number of pages: 11