Hybrid Control Policy for Artificial Pancreas via Ensemble Deep Reinforcement Learning

Cited by: 0
Authors
Lv, Wenzhou [1]
Wu, Tianyu [1]
Xiong, Luolin [1]
Wu, Liang [2, 3]
Zhou, Jian [2, 3]
Tang, Yang [1]
Qian, Feng [1]
Affiliations
[1] East China Univ Sci & Technol, State Key Lab Ind Control Technol, Shanghai 200237, Peoples R China
[2] Shanghai Jiao Tong Univ, Metab, Shanghai, Peoples R China
[3] Shanghai Diabet Inst, Shanghai Clin Ctr Diabet, Peoples Hosp 6, Shanghai, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Glucose; Insulin; Safety; Uncertainty; Pancreas; Metalearning; Accuracy; Artificial pancreas; glucose control; diabetes; reinforcement learning; meta learning; TYPE-1; MPC; SAFETY;
DOI
10.1109/TBME.2024.3451712
Chinese Library Classification number
R318 [Biomedical engineering]
Discipline classification code
0831
Abstract
Objective: The artificial pancreas (AP) shows promise for closed-loop glucose control in type 1 diabetes mellitus (T1DM). However, designing effective control policies for the AP remains challenging due to complex physiological processes, delayed insulin response, and inaccurate glucose measurements. While model predictive control (MPC) offers safety and stability through the dynamic model and safety constraints, it lacks individualization and is adversely affected by unannounced meals. Conversely, deep reinforcement learning (DRL) provides personalized and adaptive strategies but struggles with distribution shifts and substantial data requirements. Methods: We propose a hybrid control policy for the artificial pancreas (HyCPAP) to address the above challenges. HyCPAP combines an MPC policy with an ensemble DRL policy, leveraging the strengths of both policies while compensating for their respective limitations. To facilitate faster deployment of AP systems in real-world settings, we further incorporate meta-learning techniques into HyCPAP, leveraging previous experience and patient-shared knowledge to enable fast adaptation to new patients with limited available data. Results: We conduct extensive experiments using the UVA/Padova T1DM simulator across five scenarios. Our approaches achieve the highest percentage of time spent in the desired range and the lowest occurrences of hypoglycemia. Conclusion: The results clearly demonstrate the superiority of our methods for closed-loop glucose management in individuals with T1DM. Significance: The study presents novel control policies for AP systems, affirming their great potential for efficient closed-loop glucose control.
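The two outcome measures named in the abstract, percentage of time in the desired range and occurrence of hypoglycemia, can be illustrated with a minimal sketch. The abstract does not state the thresholds used, so the 70-180 mg/dL target range below is an assumption (it is the range commonly used in glucose-control reporting), as are the function name `glucose_metrics`, the equally spaced sampling, and the sample readings.

```python
from typing import Dict, Sequence


def glucose_metrics(
    readings_mg_dl: Sequence[float],
    low: float = 70.0,    # assumed hypoglycemia threshold, not taken from the paper
    high: float = 180.0,  # assumed hyperglycemia threshold, not taken from the paper
) -> Dict[str, float]:
    """Return the percentage of readings below, within, and above [low, high].

    Readings are assumed to be equally spaced in time, so the share of
    readings in a band equals the share of time spent in that band.
    """
    if not readings_mg_dl:
        raise ValueError("need at least one glucose reading")
    n = len(readings_mg_dl)
    below = sum(1 for g in readings_mg_dl if g < low)
    above = sum(1 for g in readings_mg_dl if g > high)
    in_range = n - below - above
    return {
        "time_below_range_pct": 100.0 * below / n,
        "time_in_range_pct": 100.0 * in_range / n,
        "time_above_range_pct": 100.0 * above / n,
    }


if __name__ == "__main__":
    # Hypothetical readings in mg/dL, for illustration only.
    print(glucose_metrics([65, 90, 120, 150, 200]))
```

A controller comparison of the kind the abstract reports would compute these percentages per simulated patient and scenario, then compare them across policies.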
Pages: 309-323
Number of pages: 15