histolab: A Python library for reproducible Digital Pathology preprocessing with automated testing

Cited by: 17
Authors
Marcolini, Alessia [1]
Bussola, Nicole [2, 3]
Arbitrio, Ernesto [5]
Amgad, Mohamed [6]
Jurman, Giuseppe [4]
Furlanello, Cesare [1, 3]
Affiliations
[1] HK3 Lab, Piazza Manifatture 1, I-38068 Rovereto, Italy
[2] Univ Trento, CIBIO, Via Sommar 9, I-38123 Povo, Italy
[3] Orobix Life, Via G Camozzi 144, I-24121 Bergamo, Italy
[4] Fdn Bruno Kessler, Via Sommar 18, I-38123 Povo, Italy
[5] YouGov PLC, 50 Featherstone St, London EC1Y 8R, England
[6] Northwestern Univ, 750 N Lake Shore Dr, Chicago, IL 60611 USA
Keywords
Digital Pathology; Continuous integration; Data preprocessing; Deep Learning; Reproducibility;
DOI
10.1016/j.softx.2022.101237
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Deep Learning (DL) is rapidly permeating the field of Digital Pathology, with algorithms successfully applied to ease daily clinical practice and to discover novel associations. However, most DL workflows for Digital Pathology include custom code for data preprocessing, usually tailored to the data and tasks of interest, resulting in software that is error-prone and hard to understand, peer-review, and test. In this work, we introduce histolab, a Python package designed to standardize the preprocessing of Whole Slide Images (WSIs) in a reproducible environment, supported by automated testing. In addition, the package provides functions for building datasets of WSI tiles, including augmentation and morphological operators, a tile scoring framework, and stain normalization methods. histolab is modular, extensible, and easily integrable into DL pipelines, with support for the OpenSlide and large_image backends. To guarantee robustness, histolab embraces software engineering best practices such as multiplatform automated testing and Continuous Integration. © 2022 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
Pages: 6