High-throughput adaptive sampling for whole-slide histopathology image analysis (HASHI) via convolutional neural networks: Application to invasive breast cancer detection

Cited by: 80
Authors
Cruz-Roa, Angel [1, 2]
Gilmore, Hannah [3]
Basavanhally, Ajay [4]
Feldman, Michael [5]
Ganesan, Shridar [6]
Shih, Natalie [5]
Tomaszewski, John [7]
Madabhushi, Anant [8]
Gonzalez, Fabio [2]
Affiliations
[1] Univ Llanos, Sch Engn, Villavicencio, Meta, Colombia
[2] Univ Nacl Colombia, Dept Comp Syst & Ind Engn, Bogota, Cundinamarca, Colombia
[3] Univ Hosp Case Med Ctr, Cleveland, OH USA
[4] Inspirata Inc, Tampa, FL USA
[5] Hosp Univ Penn, Philadelphia, PA USA
[6] Canc Inst New Jersey, New Brunswick, NJ USA
[7] SUNY Buffalo, Univ Buffalo, Buffalo, NY USA
[8] Case Western Reserve Univ, Cleveland, OH 44106 USA
Funding
National Institutes of Health (USA)
Keywords
INFORMATICS; FRAMEWORK; GRADE
DOI
10.1371/journal.pone.0196828
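Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]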
Subject classification codes
07; 0710; 09
Abstract
Precise detection of invasive cancer on whole-slide images (WSI) is a critical first step in digital pathology tasks of diagnosis and grading. Convolutional neural networks (CNNs) are the most popular representation learning method for computer vision tasks and have been successfully applied in digital pathology, including tumor and mitosis detection. However, CNNs are typically only tenable with relatively small image sizes (200 × 200 pixels). Only recently have fully convolutional networks (FCNs) been able to deal with larger image sizes (500 × 500 pixels) for semantic segmentation. Hence, the direct application of CNNs to WSI is not computationally feasible, because for a WSI a CNN would require billions or trillions of parameters. To alleviate this issue, this paper presents a novel method, High-throughput Adaptive Sampling for whole-slide Histopathology Image analysis (HASHI), which involves: (i) a new efficient adaptive sampling method based on probability gradient and quasi-Monte Carlo sampling, and (ii) a powerful representation learning classifier based on CNNs. We applied HASHI to automated detection of invasive breast cancer on WSI. HASHI was trained and validated using three different data cohorts involving nearly 500 cases and then independently tested on 195 studies from The Cancer Genome Atlas. The results show that (1) the adaptive sampling method is an effective strategy to deal with WSI without compromising prediction accuracy, obtaining results comparable to dense sampling (approximately 6 million samples in 24 hours) with far fewer samples (approximately 2,000 samples in 1 minute), and (2) on an independent test dataset, HASHI is effective and robust to data from multiple sites, scanners, and platforms, achieving an average Dice coefficient of 76%.
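The two ideas named in the abstract, gradient-guided quasi-Monte Carlo sampling and Dice-based evaluation, can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a Halton sequence as the quasi-Monte Carlo generator, nearest-neighbour interpolation of the sampled probabilities, and a hypothetical `classify(y, x)` callable standing in for the CNN patch classifier. The function names, grid size, and sample counts are all illustrative choices.

```python
import numpy as np


def halton(n, base):
    """First n terms of the base-`base` van der Corput sequence in (0, 1)."""
    out = np.empty(n)
    for i in range(n):
        f, r, k = 1.0, 0.0, i + 1
        while k > 0:
            f /= base
            r += f * (k % base)
            k //= base
        out[i] = r
    return out


def dice(a, b):
    """Dice coefficient 2|A and B| / (|A| + |B|) of two binary masks."""
    a, b = np.asarray(a, dtype=bool), np.asarray(b, dtype=bool)
    denom = a.sum() + b.sum()
    return 1.0 if denom == 0 else 2.0 * np.logical_and(a, b).sum() / denom


def nearest_map(pts, probs, size):
    """Nearest-neighbour interpolation of sampled probabilities onto a grid."""
    yy, xx = np.mgrid[0:size, 0:size]
    grid = np.column_stack([yy.ravel(), xx.ravel()]) + 0.5  # pixel centres
    d2 = ((grid[:, None, :] - pts[None, :, :]) ** 2).sum(axis=-1)
    return probs[d2.argmin(axis=1)].reshape(size, size)


def adaptive_sample(classify, size=64, n_init=64, n_iter=4, n_new=64):
    """Toy gradient-guided sampling loop (assumed scheme, not the paper's).

    Each iteration draws a pool of fresh Halton points and keeps the n_new
    points that fall where the current probability map changes fastest.
    """
    pool = 4 * n_new
    total = n_init + n_iter * pool
    seq = np.column_stack([halton(total, 2), halton(total, 3)]) * size
    pts, used = seq[:n_init], n_init
    probs = np.array([classify(y, x) for y, x in pts])
    for _ in range(n_iter):
        pmap = nearest_map(pts, probs, size)
        gy, gx = np.gradient(pmap)
        gmag = np.hypot(gy, gx)  # gradient magnitude of the probability map
        cand = seq[used:used + pool]
        used += pool
        score = gmag[cand[:, 0].astype(int), cand[:, 1].astype(int)]
        new = cand[np.argsort(score)[-n_new:]]  # highest-gradient candidates
        pts = np.vstack([pts, new])
        probs = np.concatenate([probs, [classify(y, x) for y, x in new]])
    return nearest_map(pts, probs, size), pts
```

Calling `adaptive_sample` with a synthetic classifier (for example, one that returns 1.0 inside a disc and 0.0 outside) and comparing the thresholded map against the true mask with `dice` gives a feel for how a few hundred samples concentrated near the boundary can approximate a dense evaluation of every pixel.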
Pages: 23