Scaling Vision Transformers to Gigapixel Images via Hierarchical Self-Supervised Learning

Cited by: 305
Authors
Chen, Richard J. [1]
Chen, Chengkuan [1]
Li, Yicong [1]
Chen, Tiffany Y. [1]
Trister, Andrew D. [2]
Krishnan, Rahul G. [3]
Mahmood, Faisal [1]
Affiliations
[1] Broad Inst, BWH, Harvard, Cambridge, MA 02142 USA
[2] Bill & Melinda Gates Fdn, Seattle, WA USA
[3] Univ Toronto, Toronto, ON, Canada
Source
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022) | 2022
Keywords
TUMOR-INFILTRATING LYMPHOCYTES; CANCER;
DOI
10.1109/CVPR52688.2022.01567
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Vision Transformers (ViTs) and their multi-scale and hierarchical variations have been successful at capturing image representations, but their use has generally been studied for low-resolution images (e.g., 256 × 256, 384 × 384). For gigapixel whole-slide imaging (WSI) in computational pathology, WSIs can be as large as 150,000 × 150,000 pixels at 20× magnification and exhibit a hierarchical structure of visual tokens across varying resolutions: from 16 × 16 images capturing individual cells, to 4096 × 4096 images characterizing interactions within the tissue microenvironment. We introduce a new ViT architecture called the Hierarchical Image Pyramid Transformer (HIPT), which leverages the natural hierarchical structure inherent in WSIs using two levels of self-supervised learning to learn high-resolution image representations. HIPT is pretrained across 33 cancer types using 10,678 gigapixel WSIs, 408,218 4096 × 4096 images, and 104M 256 × 256 images. We benchmark HIPT representations on 9 slide-level tasks, and demonstrate that: 1) HIPT with hierarchical pretraining outperforms current state-of-the-art methods for cancer subtyping and survival prediction, and 2) self-supervised ViTs are able to model important inductive biases about the hierarchical structure of phenotypes in the tumor microenvironment.
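The image sizes quoted in the abstract imply a simple token arithmetic, which the following sketch works through. It is an illustration only, not the authors' code: it assumes non-overlapping square tiling and full coverage of the slide (in practice, background regions without tissue would typically be discarded), and the names `hierarchy_counts` and `tiles_per_side` are hypothetical.

```python
# Side lengths (in pixels) quoted in the abstract.
CELL_TOKEN = 16      # 16 x 16 images capturing individual cells
PATCH = 256          # 256 x 256 images
REGION = 4096        # 4096 x 4096 images
SLIDE = 150_000      # upper-end WSI side length at 20x magnification


def tiles_per_side(outer: int, inner: int) -> int:
    """Count non-overlapping inner tiles that fit along one side of outer."""
    return outer // inner


def hierarchy_counts(slide: int = SLIDE) -> dict:
    """Token counts per level, assuming non-overlapping tiling (an assumption)."""
    return {
        "cell_tokens_per_patch": tiles_per_side(PATCH, CELL_TOKEN) ** 2,
        "patches_per_region": tiles_per_side(REGION, PATCH) ** 2,
        "regions_per_slide": tiles_per_side(slide, REGION) ** 2,
        # Sequence length if the whole slide were tokenized flat at 16 x 16.
        "flat_cell_tokens_per_slide": tiles_per_side(slide, CELL_TOKEN) ** 2,
    }


if __name__ == "__main__":
    for name, value in hierarchy_counts().items():
        print(f"{name}: {value:,}")
```

Under these assumptions each level of the pyramid sees a sequence of 256 tokens (16 per side), whereas tokenizing a full 150,000 × 150,000 slide directly at 16 × 16 would give roughly 88 million tokens. The same ratio appears in the pretraining data: 408,218 regions × 256 patches per region is about 104.5 million, in line with the 104M 256 × 256 images reported.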
Pages: 16123-16134
Number of pages: 12