OpenFL: the open federated learning library

Cited by: 40
Authors
Foley, Patrick [1]
Sheller, Micah J. [1]
Edwards, Brandon [1]
Pati, Sarthak [2]
Riviera, Walter [1]
Sharma, Mansi [1]
Moorthy, Prakash Narayana [1]
Wang, Shih-han [1]
Martin, Jason [1]
Mirhaji, Parsa [3]
Shah, Prashant [1]
Bakas, Spyridon [2]
Affiliations
[1] Intel Corp, Santa Clara, CA 95052 USA
[2] Univ Penn, 3700 Hamilton Walk,Richards Med Res Labs 7th Fl, Philadelphia, PA 19104 USA
[3] Albert Einstein Coll Med, 1300 Morris Pk Ave, Bronx, NY 10461 USA
Funding
National Institutes of Health (USA);
Keywords
federated learning; open-source; security; privacy; machine learning; deep learning;
DOI
10.1088/1361-6560/ac97d9
Chinese Library Classification number
R318 [Biomedical engineering];
Discipline classification code
0831;
Abstract
Objective. Federated learning (FL) is a computational paradigm that enables organizations to collaborate on machine learning (ML) and deep learning (DL) projects without sharing sensitive data, such as patient records, financial data, or classified secrets. Approach. The Open Federated Learning (OpenFL) framework is an open-source, Python-based tool for training ML/DL algorithms using the data-private collaborative learning paradigm of FL, irrespective of the use case. OpenFL works with training pipelines built with either TensorFlow or PyTorch, and can be extended to other ML and DL frameworks. Main results. In this manuscript, we present OpenFL and summarize its motivation and development characteristics, with the intention of facilitating its application to existing ML/DL model training in a production environment. We further provide recommendations for securing a federation with trusted execution environments, to ensure explicit model security and integrity and to maintain data confidentiality. Finally, we describe the first real-world healthcare federations that use the OpenFL library, and highlight how it can be applied to non-healthcare use cases. Significance. The OpenFL library is designed for real-world scalability and trusted execution, and it prioritizes easy migration of centralized ML models into a federated training pipeline. Although OpenFL's initial use case was in healthcare, it is applicable beyond this domain and is now reaching wider adoption in both research and production settings. The tool is open-sourced at github.com/intel/openfl.
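Illustrative note (not part of the published abstract): the data-private paradigm described above can be sketched as a loop in which each site trains on its own data and only model weights are sent for aggregation. The sketch below is a minimal, plain-Python illustration of sample-weighted federated averaging on a one-parameter linear model. It does not use or represent the OpenFL API, and the abstract does not state which aggregation algorithm OpenFL uses; the names `local_train`, `federated_average`, and `run_federation`, the toy data, and the hyperparameters are all hypothetical.

```python
from typing import List, Tuple

# Hypothetical toy dataset type: (x, y) pairs that never leave the owning site.
Dataset = List[Tuple[float, float]]


def local_train(w: float, data: Dataset, lr: float = 0.05, epochs: int = 5) -> float:
    """Run a few gradient-descent steps on y ~ w * x using only local data."""
    for _ in range(epochs):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_average(updates: List[Tuple[float, int]]) -> float:
    """Combine (weight, sample_count) pairs into a sample-weighted mean."""
    total = sum(n for _, n in updates)
    return sum(w * n for w, n in updates) / total


def run_federation(sites: List[Dataset], rounds: int = 20) -> float:
    """Alternate local training and aggregation; only weights are exchanged."""
    w = 0.0
    for _ in range(rounds):
        updates = [(local_train(w, data), len(data)) for data in sites]
        w = federated_average(updates)
    return w


if __name__ == "__main__":
    # Assumed toy data: three sites, all drawn from y = 2x.
    sites = [
        [(1.0, 2.0), (2.0, 4.0)],
        [(3.0, 6.0)],
        [(0.5, 1.0), (1.5, 3.0), (2.5, 5.0)],
    ]
    print(run_federation(sites))
```

In this sketch the aggregator sees only a scalar weight and a sample count from each site, which is the property the abstract refers to as collaborating without sharing sensitive data.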
Pages: 11