Development and validation of artificial intelligence-based prescreening of large-bowel biopsies taken in the UK and Portugal: a retrospective cohort study

Cited by: 10
Authors
Bilal, Mohsin [1, 5]
Tsang, Yee Wah [3]
Ali, Mahmoud [3]
Graham, Simon [1, 4]
Hero, Emily [3, 6]
Wahab, Noorul [1]
Dodd, Katherine [3]
Sahota, Harvir [3]
Wu, Shaobin [7]
Lu, Wenqi [1]
Jahanifar, Mostafa [1]
Robinson, Andrew [3]
Azam, Ayesha [3]
Benes, Ksenija [8]
Nimir, Mohammed [3]
Hewitt, Katherine [3]
Bhalerao, Abhir [1]
Eldaly, Hesham
Raza, Shan E. Ahmed [1]
Gopalakrishnan, Kishore [3]
Minhas, Fayyaz [1]
Snead, David [2, 3, 4]
Rajpoot, Nasir [1, 3, 4, 9]
Affiliations
[1] Univ Warwick, Tissue Image Analyt Ctr, Dept Comp Sci, Coventry CV4 7AL, England
[2] Univ Warwick, Warwick Med Sch, Coventry, England
[3] Univ Hosp Coventry & Warwickshire Natl Hlth Serv T, Dept Pathol, Coventry, England
[4] Histofy, Birmingham, England
[5] Natl Univ Comp & Emerging Sci, Dept Artificial Intelligence & Data Sci, Islamabad, Pakistan
[6] Univ Hosp Leicester Natl Hlth Serv Trust, Dept Pathol, Leicester, England
[7] East Suffolk & North Essex Natl Hlth Serv Fdn Trus, Dept Pathol, Colchester, England
[8] Royal Wolverhampton Natl Hlth Serv Trust, Dept Gastroenterol, Wolverhampton, England
[9] Alan Turing Inst, London, England
Funding
Innovate UK
Keywords
Decision making - Deep learning - Forecasting - Hospitals - Image enhancement - Iterative methods
DOI
10.1016/S2589-7500(23)00148-6
Chinese Library Classification (CLC) number
R-058
Abstract
Background: Histopathological examination is a crucial step in the diagnosis and treatment of many major diseases. Aiming to facilitate diagnostic decision making and reduce the workload of pathologists, we developed an artificial intelligence (AI)-based prescreening tool that analyses whole-slide images (WSIs) of large-bowel biopsies to identify typical, non-neoplastic, and neoplastic biopsies.

Methods: This retrospective cohort study was conducted with an internal development cohort of slides acquired from a hospital in the UK and three external validation cohorts of WSIs acquired from two hospitals in the UK and one clinical laboratory in Portugal. To learn the differential histological patterns from digitised WSIs of large-bowel biopsy slides, our proposed weakly supervised deep-learning model (Colorectal AI Model for Abnormality Detection [CAIMAN]) used slide-level diagnostic labels and no detailed cell or region-level annotations. The method was developed with an internal development cohort of 5054 biopsy slides from 2080 patients that were labelled with corresponding diagnostic categories assigned by pathologists. The three external validation cohorts, with a total of 1536 slides, were used for independent validation of CAIMAN. Each WSI was classified into one of three classes (ie, typical, atypical non-neoplastic, and atypical neoplastic). Prediction scores of image tiles were aggregated into three prediction scores for the whole slide: one for its likelihood of being typical, one for its likelihood of being non-neoplastic, and one for its likelihood of being neoplastic. The assessment of the external validation cohorts was conducted by the trained and frozen CAIMAN model. To evaluate model performance, we calculated area under the convex hull of the receiver operating characteristic curve (AUROC), area under the precision-recall curve, and specificity compared with our previously published iterative draw and rank sampling (IDaRS) algorithm. We also generated heat maps and saliency maps to analyse and visualise the relationship between the WSI diagnostic labels and spatial features of the tissue microenvironment. The main outcome of this study was the ability of CAIMAN to accurately identify typical and atypical WSIs of colon biopsies, which could potentially facilitate automatic removal of typical biopsies from the diagnostic workload in clinics.

Findings: A randomly selected subset of all large-bowel biopsies was obtained between Jan 1, 2012, and Dec 31, 2017. The AI training, validation, and assessments were done between Jan 1, 2021, and Sept 30, 2022. WSIs with diagnostic labels were collected between Jan 1 and Sept 30, 2022. Our analysis showed no statistically significant differences across prediction scores from CAIMAN for typical and atypical classes based on anatomical sites of the biopsy. At 0.99 sensitivity, CAIMAN (specificity 0.5592) was more accurate than an IDaRS-based weakly supervised WSI-classification pipeline (0.4629) in identifying typical and atypical biopsies on cross-validation in the internal development cohort (p<0.0001). At 0.99 sensitivity, CAIMAN was also more accurate than IDaRS for two external validation cohorts (p<0.0001), but not for a third external validation cohort (p=0.10). CAIMAN provided higher specificity than IDaRS at some high-sensitivity thresholds (0.7763 vs 0.6222 for 0.95 sensitivity, 0.7126 vs 0.5407 for 0.97 sensitivity, and 0.5615 vs 0.3970 for 0.99 sensitivity on one of the external validation cohorts) and showed high classification performance in distinguishing between neoplastic biopsies (AUROC 0.9928, 95% CI 0.9927-0.9929), inflammatory biopsies (0.9658, 0.9655-0.9661), and atypical biopsies (0.9789, 0.9786-0.9792). On the three external validation cohorts, CAIMAN had AUROC values of 0.9431 (95% CI 0.9165-0.9697), 0.9576 (0.9568-0.9584), and 0.9636 (0.9615-0.9657) for the detection of atypical biopsies. Saliency maps supported the representation of disease heterogeneity in model predictions and its association with relevant histological features.

Interpretation: CAIMAN, with its high sensitivity in detecting atypical large-bowel biopsies, might be a promising improvement in clinical workflow efficiency and diagnostic decision making in prescreening of typical colorectal biopsies.
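
The Findings compare models by their specificity at fixed sensitivity levels (0.95, 0.97, and 0.99). As an illustration only, and not the authors' evaluation code, the following minimal Python sketch shows one common way to compute such an operating point from slide-level scores. The function name `specificity_at_sensitivity`, the convention that a slide is flagged when its score is at or above the threshold, and the toy data are all assumptions made for this example.

```python
import math


def specificity_at_sensitivity(scores, labels, target_sensitivity=0.99):
    """Return (threshold, specificity) at the highest threshold whose
    sensitivity is at least `target_sensitivity`.

    scores: per-slide score for the atypical class (higher = more atypical).
    labels: 1 for atypical (positive), 0 for typical (negative).
    A slide is flagged as atypical when score >= threshold (assumed convention).
    """
    positives = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one atypical and one typical slide")

    # Minimum number of atypical slides that must be flagged to reach the
    # target sensitivity; the small epsilon guards against float round-off.
    needed = max(1, math.ceil(target_sensitivity * len(positives) - 1e-9))

    # Highest threshold that still flags the `needed` top-scoring atypical slides.
    threshold = positives[needed - 1]

    # Specificity: fraction of typical slides that fall below the threshold.
    true_negatives = sum(1 for s in negatives if s < threshold)
    return threshold, true_negatives / len(negatives)


# Toy data (hypothetical): four atypical and four typical slides.
toy_scores = [0.9, 0.8, 0.7, 0.2, 0.6, 0.3, 0.1, 0.05]
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_threshold, toy_specificity = specificity_at_sensitivity(toy_scores, toy_labels, 0.99)
```

In a prescreening setting, specificity at a high fixed sensitivity roughly corresponds to the share of typical slides that could be set aside while still flagging nearly all atypical ones, which is why the comparison is reported at 0.95 to 0.99 sensitivity.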
Pages: E786-E797
Page count: 12