Benchmarking the generalizability of brain age models: Challenges posed by scanner variance and prediction bias

Cited by: 16
Authors
Jirsaraie, Robert J. [1]
Kaufmann, Tobias [2, 3, 4]
Bashyam, Vishnu [5]
Erus, Guray [5]
Luby, Joan L. [6]
Westlye, Lars T. [3, 4, 7, 8]
Davatzikos, Christos [5]
Barch, Deanna M. [9]
Sotiras, Aristeidis [10]
Affiliations
[1] Washington Univ, Div Computat & Data Sci, One Brookings Dr,CB 1125, St Louis, MO 63130 USA
[2] Univ Tubingen, Dept Psychiat & Psychotherapy, Tubingen Ctr Mental Hlth, Tubingen, Germany
[3] Univ Oslo, Oslo Univ Hosp, Norwegian Ctr Mental Disorders Res, Div Mental Hlth & Addict, Oslo, Norway
[4] Univ Oslo, Inst Clin Med, Oslo, Norway
[5] Univ Penn, Ctr Biomed Image Comp & Analyt, Dept Radiol, Philadelphia, PA 19104 USA
[6] Washington Univ, Dept Psychiat, St Louis, MO 63130 USA
[7] Univ Oslo, Dept Psychol, Oslo, Norway
[8] Univ Oslo, KG Jebsen Ctr Neurodev Disorders, Oslo, Norway
[9] Washington Univ, Dept Psychol & Brain Sci, St Louis, MO 63130 USA
[10] Washington Univ, Dept Radiol, Sch Med St Louis, St Louis, MO 63130 USA
Funding
National Science Foundation (USA)
Keywords
brain age; brain development; computational neuroscience; generalizability; machine learning; PATTERNS
DOI
10.1002/hbm.26144
Chinese Library Classification number
Q189 [Neuroscience]
Subject classification code
071006
Abstract
Machine learning has been increasingly applied to neuroimaging data to predict age, deriving a personalized biomarker with potential clinical applications. The scientific and clinical value of these models depends on their applicability to independently acquired scans from diverse sources. Accordingly, we evaluated the generalizability of two brain age models that were trained across the lifespan by applying them to three distinct early-life samples with participants aged 8-22 years. These models were chosen based on the size and diversity of their training data, but they also differed greatly in their processing methods and predictive algorithms. Specifically, one brain age model was built by applying gradient tree boosting (GTB) to extracted features of cortical thickness, surface area, and brain volume. The other model applied a 2D convolutional neural network (DeepBrainNet; DBN) to minimally preprocessed slices of T1-weighted scans. Additional model variants were created to understand how generalizability changed when each model was trained with data that became more similar to the test samples in terms of age and acquisition protocols. Our results illustrated numerous trade-offs. The GTB predictions were more accurate overall and remained more reliable when applied to lower quality scans. In contrast, the DBN displayed the most utility in detecting associations between brain age gaps and cognitive functioning. Broadly speaking, the largest limitations affecting generalizability were acquisition protocol differences and biased brain age estimates. If such confounds could eventually be removed without post-hoc corrections, brain age predictions could have greater utility as personalized biomarkers of healthy aging.
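The abstract refers to brain age gaps, biased brain age estimates, and post-hoc corrections without defining them. The following is a minimal sketch, not the authors' pipeline: it assumes the gap is defined as predicted age minus chronological age, and it illustrates one commonly used post-hoc correction (regressing the gap on chronological age and keeping the residual). The function names and the toy ages and predictions are hypothetical and chosen only for illustration.

```python
from statistics import mean


def brain_age_gap(predicted, chronological):
    """Gap per participant: predicted age minus chronological age (assumed definition)."""
    return [p - c for p, c in zip(predicted, chronological)]


def fit_line(x, y):
    """Ordinary least squares fit of y ~ intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def bias_corrected_gap(predicted, chronological):
    """Remove the linear dependence of the gap on chronological age (post-hoc)."""
    gaps = brain_age_gap(predicted, chronological)
    intercept, slope = fit_line(chronological, gaps)
    return [g - (intercept + slope * a) for g, a in zip(gaps, chronological)]


# Hypothetical toy data: predictions pulled toward the middle of the age range,
# so younger participants are overestimated and older ones underestimated.
ages = [8.0, 12.0, 16.0, 20.0, 22.0]
preds = [11.0, 13.5, 16.0, 18.5, 19.0]

gaps = brain_age_gap(preds, ages)
corrected = bias_corrected_gap(preds, ages)
mae = mean(abs(g) for g in gaps)  # mean absolute error of the raw predictions
```

In this toy case the raw gaps shrink steadily with age, which is the kind of age-dependent bias the abstract describes. After the correction the residual gaps no longer trend with age, but they were obtained only by using the known chronological ages, which is why the abstract treats the need for such corrections as a limitation.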
Pages: 1118-1128
Page count: 11