Hybrid Control Policy for Artificial Pancreas via Ensemble Deep Reinforcement Learning

Cited: 0
Authors
Lv, Wenzhou [1 ]
Wu, Tianyu [1 ]
Xiong, Luolin [1 ]
Wu, Liang [2 ,3 ]
Zhou, Jian [2 ,3 ]
Tang, Yang [1 ]
Qian, Feng [1 ]
Affiliations
[1] East China Univ Sci & Technol, State Key Lab Ind Control Technol, Shanghai 200237, Peoples R China
[2] Shanghai Jiao Tong Univ, Metab, Shanghai, Peoples R China
[3] Shanghai Diabet Inst, Shanghai Clin Ctr Diabet, Peoples Hosp 6, Shanghai, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Glucose; Insulin; Safety; Uncertainty; Pancreas; Metalearning; Accuracy; Artificial pancreas; glucose control; diabetes; reinforcement learning; meta learning; TYPE-1; MPC; SAFETY;
DOI
10.1109/TBME.2024.3451712
Chinese Library Classification (CLC)
R318 [Biomedical Engineering];
Discipline Code
0831;
Abstract
Objective: The artificial pancreas (AP) shows promise for closed-loop glucose control in type 1 diabetes mellitus (T1DM). However, designing effective control policies for the AP remains challenging due to complex physiological processes, delayed insulin response, and inaccurate glucose measurements. While model predictive control (MPC) offers safety and stability through its dynamic model and safety constraints, it lacks individualization and is adversely affected by unannounced meals. Conversely, deep reinforcement learning (DRL) provides personalized and adaptive strategies but struggles with distribution shifts and substantial data requirements. Methods: We propose a hybrid control policy for the artificial pancreas (HyCPAP) to address these challenges. HyCPAP combines an MPC policy with an ensemble DRL policy, leveraging the strengths of both while compensating for their respective limitations. To facilitate faster deployment of AP systems in real-world settings, we further incorporate meta-learning techniques into HyCPAP, leveraging prior experience and patient-shared knowledge to enable fast adaptation to new patients with limited available data. Results: We conduct extensive experiments using the UVA/Padova T1DM simulator across five scenarios. Our approaches achieve the highest percentage of time spent in the desired range and the lowest occurrences of hypoglycemia. Conclusion: The results clearly demonstrate the superiority of our methods for closed-loop glucose management in individuals with T1DM. Significance: This study presents novel control policies for AP systems, affirming their great potential for efficient closed-loop glucose control.
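The core idea of the abstract, fusing an MPC baseline with an ensemble of DRL policies under hard safety limits, can be illustrated with a minimal sketch. Everything here is an assumption for illustration only: the function names (`mpc_dose`, `ensemble_drl_dose`, `hybrid_dose`), the proportional stand-ins for the real MPC and DRL controllers, the weighted-average fusion rule, and all gains and thresholds. The paper's actual HyCPAP algorithm is not specified in this record.

```python
import numpy as np

def mpc_dose(glucose, target=120.0, gain=0.01):
    """Toy proportional surrogate for an MPC insulin dose (U/h).

    A real MPC would solve a constrained optimization over a
    glucose-insulin dynamic model; this stand-in only shows the
    interface: glucose reading in, suggested dose out.
    """
    return max(0.0, gain * (glucose - target))

def ensemble_drl_dose(glucose, policies):
    """Average the doses proposed by an ensemble of DRL policies."""
    return float(np.mean([p(glucose) for p in policies]))

def hybrid_dose(glucose, policies, w=0.5, max_dose=5.0):
    """Blend the MPC and ensemble-DRL proposals, then clip for safety.

    `w` weights the DRL ensemble against the MPC baseline; the clip
    mimics the hard insulin-delivery constraints an AP must respect.
    """
    fused = (1.0 - w) * mpc_dose(glucose) + w * ensemble_drl_dose(glucose, policies)
    return float(np.clip(fused, 0.0, max_dose))

# Three toy "DRL policies": linear controllers with slightly different gains,
# standing in for independently trained networks in the ensemble.
policies = [lambda g, k=k: max(0.0, k * (g - 110.0)) for k in (0.008, 0.010, 0.012)]

print(hybrid_dose(180.0, policies))  # hyperglycemia: positive fused dose
print(hybrid_dose(70.0, policies))   # hypoglycemia: dose suspended at 0
```

Averaging the ensemble members and clipping the fused command reflect the abstract's two stated goals, robustness to individual-policy uncertainty and avoidance of hypoglycemia, but the specific fusion rule is a placeholder.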
Pages: 309-323
Page count: 15