PPML-Omics: A privacy-preserving federated machine learning method protects patients' privacy in omic data

Cited by: 19
Authors
Zhou, Juexiao [1 ,2 ]
Chen, Siyuan [1 ,2 ]
Wu, Yulian [1 ,2 ]
Li, Haoyang [1 ,2 ]
Zhang, Bin [1 ,2 ]
Zhou, Longxi [1 ,2 ]
Hu, Yan [1 ]
Xiang, Zihang [1 ]
Li, Zhongxiao [1 ,2 ]
Chen, Ningning [1 ,2 ]
Han, Wenkai [1 ,2 ]
Xu, Chencheng [1 ,2 ]
Wang, Di [1 ,2 ]
Gao, Xin [1 ,2 ]
Affiliations
[1] King Abdullah Univ Sci & Technol KAUST, Comp Sci Program, Comp Elect & Math Sci & Engn Div, Thuwal 23955-6900, Saudi Arabia
[2] King Abdullah Univ Sci & Technol KAUST, Computat Biosci Res Ctr, Comp Elect & Math Sci & Engn Div, Thuwal 23955-6900, Saudi Arabia
Keywords
DIFFERENTIAL PRIVACY; RNA-SEQ; TECHNOLOGIES; LEAKAGE;
DOI
10.1126/sciadv.adh8601
Chinese Library Classification (CLC)
O [Mathematical Sciences and Chemistry]; P [Astronomy, Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Subject classification codes
07 ; 0710 ; 09 ;
Abstract
Modern machine learning models for omic data analysis raise the threat of privacy leakage for the patients whose data are included in those datasets. Here, we propose a secure and privacy-preserving machine learning method (PPML-Omics) built on a decentralized, differentially private federated learning algorithm. We applied PPML-Omics to data from three sequencing technologies and addressed the privacy concern in three major omic data analysis tasks under three representative deep learning models. We examined privacy breaches in depth through privacy attack experiments and demonstrated that PPML-Omics protects patients' privacy. In each of these applications, PPML-Omics outperformed the comparison methods under the same level of privacy guarantee, demonstrating its versatility in balancing privacy-preserving capability and utility in omic data analysis. Furthermore, we provide a theoretical proof of the privacy-preserving capability of PPML-Omics, making it, to our knowledge, the first method with a mathematical privacy guarantee and robust, generalizable empirical performance in protecting patients' privacy in omic data.
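The abstract describes a decentralized, differentially private federated learning scheme. The sketch below illustrates the general idea of one such round: clients clip their model updates and add Gaussian noise before the updates are shuffled and averaged. It is a minimal illustration under assumed parameters (the clip bound, noise scale, toy model size, and function names are placeholders), not the authors' PPML-Omics implementation.

```python
# Minimal sketch of one round of differentially private federated averaging
# with client-side clipping/noising and a shuffling step before aggregation.
# NOT the PPML-Omics implementation; all parameters are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

def clip_and_noise(update, clip_norm=1.0, noise_std=0.5):
    """Clip a client's update to an L2 bound and add Gaussian noise (client-side DP step)."""
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip_norm / (norm + 1e-12))
    return clipped + rng.normal(0.0, noise_std * clip_norm, size=update.shape)

def federated_round(global_weights, client_updates):
    """Aggregate privatized client updates after shuffling them.
    The in-place shuffle stands in for a shuffle-model intermediary that
    breaks the linkage between clients and their updates before aggregation."""
    privatized = [clip_and_noise(u) for u in client_updates]
    rng.shuffle(privatized)
    return global_weights + np.mean(privatized, axis=0)

# Toy usage: 5 clients contributing updates for a 10-dimensional model.
global_w = np.zeros(10)
updates = [rng.normal(size=10) for _ in range(5)]
global_w = federated_round(global_w, updates)
print(global_w)
```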
Pages: 18