Federated learning and differential privacy for medical image analysis

Cited by: 185
Authors
Adnan, Mohammed [1 ,2 ,4 ]
Kalra, Shivam [1 ,2 ]
Cresswell, Jesse C. [3 ]
Taylor, Graham W. [2 ,4 ]
Tizhoosh, Hamid R. [1 ,2 ,5 ]
Affiliations
[1] Univ Waterloo, Kimia Lab, Waterloo, ON, Canada
[2] MaRS Discovery Dist, Vector Inst, Toronto, ON, Canada
[3] MaRS Discovery Dist, Layer 6 AI, Toronto, ON, Canada
[4] Univ Guelph, Guelph, ON, Canada
[5] Mayo Clin, Artificial Intelligence & Informat, Rochester, MN 55902 USA
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC);
Keywords
NOISE;
DOI
10.1038/s41598-022-05539-7
Chinese Library Classification
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Subject Classification Codes
07; 0710; 09;
Abstract
The artificial intelligence revolution has been spurred forward by the availability of large-scale datasets. In contrast, the paucity of large-scale medical datasets hinders the application of machine learning in healthcare. The lack of publicly available multi-centric and diverse datasets mainly stems from confidentiality and privacy concerns around sharing medical data. To demonstrate a feasible path forward in medical imaging, we conduct a case study of applying a differentially private federated learning framework to the analysis of histopathology images, the largest and perhaps most complex medical images. We study the effects of IID and non-IID distributions, the number of healthcare providers, i.e., hospitals and clinics, and individual dataset sizes, using The Cancer Genome Atlas (TCGA) dataset, a public repository, to simulate a distributed environment. We empirically compare the performance of private, distributed training to conventional training and demonstrate that distributed training can achieve similar performance with strong privacy guarantees. We also study the effect of different source domains for histopathology images by evaluating performance using external validation. Our work indicates that differentially private federated learning is a viable and reliable framework for the collaborative development of machine learning models in medical image analysis.
Pages: 10
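The abstract describes the general recipe of differentially private federated learning: each hospital trains locally on its own data with a DP-SGD-style update (per-example gradient clipping plus Gaussian noise), and a central server averages the resulting model weights (federated averaging). The following is a minimal, self-contained sketch of that recipe on a synthetic logistic-regression task, not the authors' implementation or data: the number of hospitals, the clipping norm, the noise multiplier, and the covariate-shift simulation of non-IID clients are all illustrative assumptions, whereas the paper itself trains on TCGA histopathology images.

```python
import numpy as np

rng = np.random.default_rng(0)

N_FEATURES = 20
TRUE_W = np.tile([1.0, -1.0], N_FEATURES // 2)  # shared ground-truth rule for all hospitals

def make_client_data(n_samples, shift):
    """Synthetic per-hospital data; `shift` induces non-IID (covariate-shifted) features."""
    X = rng.normal(loc=shift, scale=1.0, size=(n_samples, N_FEATURES))
    y = (X @ TRUE_W + rng.normal(scale=0.5, size=n_samples) > 0).astype(float)
    return X, y

def dp_local_update(w, X, y, lr=0.1, epochs=5, clip_norm=1.0, noise_multiplier=1.0):
    """One hospital's local round: logistic regression trained with per-example
    gradient clipping and Gaussian noise (DP-SGD style), full-batch for brevity."""
    w = w.copy()
    for _ in range(epochs):
        preds = 1.0 / (1.0 + np.exp(-(X @ w)))           # sigmoid predictions
        per_example_grads = (preds - y)[:, None] * X     # logistic-loss gradients
        norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
        clipped = per_example_grads / np.maximum(1.0, norms / clip_norm)
        noise = rng.normal(scale=noise_multiplier * clip_norm, size=w.shape)
        w -= lr * (clipped.sum(axis=0) + noise) / len(X)
    return w

n_hospitals, rounds = 4, 30
clients = [make_client_data(200, shift=0.5 * i) for i in range(n_hospitals)]
global_w = np.zeros(N_FEATURES)

for _ in range(rounds):
    # Each hospital trains locally under DP; the server averages the weights (FedAvg).
    local_ws = [dp_local_update(global_w, X, y) for X, y in clients]
    global_w = np.mean(local_ws, axis=0)

# Evaluate the shared model across all simulated hospitals.
X_all = np.vstack([X for X, _ in clients])
y_all = np.concatenate([y for _, y in clients])
accuracy = np.mean(((X_all @ global_w) > 0).astype(float) == y_all)
print(f"Global model accuracy after {rounds} DP-FedAvg rounds: {accuracy:.3f}")
```

The clipping step bounds each example's influence on the update, so the added Gaussian noise yields a per-step privacy guarantee that can be composed across local steps and communication rounds; only model weights, never raw images, leave a simulated hospital.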