Towards a general-purpose foundation model for computational pathology

Cited by: 256
Authors
Chen, Richard J. [1, 2, 3, 4, 5]
Ding, Tong [1, 6]
Lu, Ming Y. [1, 2, 3, 4, 7]
Williamson, Drew F. K. [1, 2, 3]
Jaume, Guillaume [1, 2, 3, 4]
Song, Andrew H. [1, 2, 3, 4]
Chen, Bowen [1, 2]
Zhang, Andrew [1, 2, 3, 4, 8]
Shao, Daniel [1, 2, 3, 4, 8]
Shaban, Muhammad [1, 2, 3, 4]
Williams, Mane [1, 2, 3, 4, 5]
Oldenburg, Lukas [1]
Weishaupt, Luca L. [1, 2, 3, 4, 8]
Wang, Judy J. [1]
Vaidya, Anurag [1, 2, 3, 4, 8]
Le, Long Phi [2, 8]
Gerber, Georg [1]
Sahai, Sharifa [1, 2, 3, 4, 9]
Williams, Walt [1, 6]
Mahmood, Faisal [1, 2, 3, 4, 10]
Affiliations
[1] Harvard Med Sch, Brigham & Womens Hosp, Dept Pathol, Boston, MA 02115 USA
[2] Harvard Med Sch, Massachusetts Gen Hosp, Dept Pathol, Boston, MA 02115 USA
[3] Broad Inst Harvard & MIT, Canc Program, Cambridge, MA 02142 USA
[4] Dana Farber Canc Inst, Canc Data Sci Program, Boston, MA 02215 USA
[5] Harvard Med Sch, Dept Biomed Informat, Boston, MA USA
[6] Harvard Univ, Harvard John A Paulson Sch Engn & Appl Sci, Cambridge, MA USA
[7] Massachusetts Inst Technol MIT, Elect Engn & Comp Sci, Cambridge, MA USA
[8] Harvard MIT, Hlth Sci & Technol, Cambridge, MA USA
[9] Harvard Univ, Dept Syst Biol, Cambridge, MA USA
[10] Harvard Univ, Harvard Data Sci Initiat, Cambridge, MA 02138 USA
Funding
US National Institutes of Health (NIH);
Keywords
SOMATIC GENOMIC LANDSCAPE; ARTIFICIAL-INTELLIGENCE; CANCER; ADENOCARCINOMAS; BIOPSIES; FEATURES; SYSTEM;
DOI
10.1038/s41591-024-02857-3
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Quantitative evaluation of tissue images is crucial for computational pathology (CPath) tasks, requiring the objective characterization of histopathological entities from whole-slide images (WSIs). The high resolution of WSIs and the variability of morphological features present significant challenges, complicating the large-scale annotation of data for high-performance applications. To address this challenge, current efforts have proposed the use of pretrained image encoders through transfer learning from natural image datasets or self-supervised learning on publicly available histopathology datasets, but have not been extensively developed and evaluated across diverse tissue types at scale. We introduce UNI, a general-purpose self-supervised model for pathology, pretrained using more than 100 million images from over 100,000 diagnostic H&E-stained WSIs (>77 TB of data) across 20 major tissue types. The model was evaluated on 34 representative CPath tasks of varying diagnostic difficulty. In addition to outperforming previous state-of-the-art models, we demonstrate new modeling capabilities in CPath such as resolution-agnostic tissue classification, slide classification using few-shot class prototypes, and disease subtyping generalization in classifying up to 108 cancer types in the OncoTree classification system. UNI advances unsupervised representation learning at scale in CPath in terms of both pretraining data and downstream evaluation, enabling data-efficient artificial intelligence models that can generalize and transfer to a wide range of diagnostically challenging tasks and clinical workflows in anatomic pathology.
Pages: 850-862
Number of pages: 13