Hybrid Control Policy for Artificial Pancreas via Ensemble Deep Reinforcement Learning

Cited by: 0

Authors:

Lv, Wenzhou ^{[1]}

Wu, Tianyu ^{[1]}

Xiong, Luolin ^{[1]}

Wu, Liang ^{[2,3]}

Zhou, Jian ^{[2,3]}

Tang, Yang ^{[1]}

Qian, Feng ^{[1]}

Affiliations:

[1] East China Univ Sci & Technol, State Key Lab Ind Control Technol, Shanghai 200237, Peoples R China

[2] Shanghai Jiao Tong Univ, Metab, Shanghai, Peoples R China

[3] Shanghai Diabet Inst, Shanghai Clin Ctr Diabet, Peoples Hosp 6, Shanghai, Peoples R China

Source:

IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING | 2025 / Vol. 72 / Issue 01

Funding:

National Natural Science Foundation of China;

Keywords:

Glucose; Insulin; Safety; Uncertainty; Pancreas; Metalearning; Accuracy; Artificial pancreas; glucose control; diabetes; reinforcement learning; meta learning; TYPE-1; MPC; SAFETY;

DOI:

10.1109/TBME.2024.3451712

Chinese Library Classification (CLC) number:

R318 [Biomedical Engineering];

Discipline classification code:

0831 ;

Abstract:

Objective: The artificial pancreas (AP) shows promise for closed-loop glucose control in type 1 diabetes mellitus (T1DM). However, designing effective control policies for the AP remains challenging due to complex physiological processes, delayed insulin response, and inaccurate glucose measurements. While model predictive control (MPC) offers safety and stability through the dynamic model and safety constraints, it lacks individualization and is adversely affected by unannounced meals. Conversely, deep reinforcement learning (DRL) provides personalized and adaptive strategies but struggles with distribution shifts and substantial data requirements. Methods: We propose a hybrid control policy for the artificial pancreas (HyCPAP) to address the above challenges. HyCPAP combines an MPC policy with an ensemble DRL policy, leveraging the strengths of both policies while compensating for their respective limitations. To facilitate faster deployment of AP systems in real-world settings, we further incorporate meta-learning techniques into HyCPAP, leveraging previous experience and patient-shared knowledge to enable fast adaptation to new patients with limited available data. Results: We conduct extensive experiments using the UVA/Padova T1DM simulator across five scenarios. Our approaches achieve the highest percentage of time spent in the desired range and the lowest occurrences of hypoglycemia. Conclusion: The results clearly demonstrate the superiority of our methods for closed-loop glucose management in individuals with T1DM. Significance: The study presents novel control policies for AP systems, affirming their great potential for efficient closed-loop glucose control.
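
The abstract does not say how the MPC policy and the ensemble DRL policy are combined. As a rough illustration only, the sketch below assumes one plausible rule: use the ensemble's mean action when its members agree, and fall back to the model-based controller when they disagree. The function `hybrid_action`, the threshold `max_disagreement`, and the toy controllers are all hypothetical and are not taken from the paper.

```python
from statistics import mean, pstdev
from typing import Callable, Sequence

# A policy maps a glucose reading (mg/dL) to an insulin rate (U/h).
Policy = Callable[[float], float]


def hybrid_action(
    glucose: float,
    ensemble: Sequence[Policy],
    fallback: Policy,
    max_disagreement: float = 0.2,
) -> float:
    """Pick an insulin rate from an ensemble, or defer to a fallback policy.

    ASSUMPTION: ensemble disagreement (population standard deviation of the
    members' proposals) is used as the uncertainty signal. The paper may use
    a different combination rule.
    """
    proposals = [member(glucose) for member in ensemble]
    if pstdev(proposals) > max_disagreement:
        # Members disagree: treat the learned policy as unreliable here.
        return fallback(glucose)
    # Members agree: use their average, never a negative insulin rate.
    return max(0.0, mean(proposals))


def toy_mpc(glucose: float) -> float:
    """Hypothetical stand-in for an MPC policy (a proportional rule)."""
    return max(0.0, 0.01 * (glucose - 110.0))


def make_member(gain: float) -> Policy:
    """Hypothetical stand-in for one trained DRL actor."""
    return lambda glucose: max(0.0, gain * (glucose - 110.0))


agreeing = [make_member(g) for g in (0.009, 0.010, 0.011)]
disagreeing = [make_member(g) for g in (0.0, 0.01, 0.05)]

print(hybrid_action(150.0, agreeing, toy_mpc))     # ensemble mean is used
print(hybrid_action(210.0, disagreeing, toy_mpc))  # fallback is used
```

In this toy setup the agreeing ensemble's proposals stay close together, so their mean is returned, while the disagreeing ensemble exceeds the threshold and the call returns whatever `toy_mpc` proposes.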


Pages: 309-323

Number of pages: 15

50 items in total

[1] Policy ensemble gradient for continuous control problems in deep reinforcement learning
Liu, Guoqiang
Chen, Gang
Huang, Victoria
NEUROCOMPUTING, 2023, 548
[2] Hybrid LMC: Hybrid Learning and Model-based Control for Wheeled Humanoid Robot via Ensemble Deep Reinforcement Learning
Baek, Donghoon
Purushottam, Amartya
Ramos, Joao
2022 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2022, : 9347 - 9354
[3] Artificial Pancreas Control for Diabetes using TD3 Deep Reinforcement Learning
Mackey, Alan
Furey, Eoghan
2022 33RD IRISH SIGNALS AND SYSTEMS CONFERENCE (ISSC), 2022,
[4] Continuous control of structural vibrations using hybrid deep reinforcement learning policy
Panda, Jagajyoti
Chopra, Mudit
Matsagar, Vasant
Chakraborty, Souvik
EXPERT SYSTEMS WITH APPLICATIONS, 2024, 252
[5] Deep Ensemble Reinforcement Learning with Multiple Deep Deterministic Policy Gradient Algorithm
Wu, Junta
Li, Huiyun
MATHEMATICAL PROBLEMS IN ENGINEERING, 2020, 2020 (2020)
[6] Optimizing Policy via Deep Reinforcement Learning for Dialogue Management
Xu, Guanghao
Lee, Hyunjung
Koo, Myoung-Wan
Seo, Jungyun
2018 IEEE INTERNATIONAL CONFERENCE ON BIG DATA AND SMART COMPUTING (BIGCOMP), 2018, : 582 - 589
[7] Efficient Deep Reinforcement Learning via Adaptive Policy Transfer
Yang, Tianpei
Hao, Jianye
Meng, Zhaopeng
Zhang, Zongzhang
Hu, Yujing
Chen, Yingfeng
Fan, Changjie
Wang, Weixun
Liu, Wulong
Wang, Zhaodong
Peng, Jiajie
PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, : 3094 - 3100
[8] Dynamic modeling and control of pneumatic artificial muscles via Deep Lagrangian Networks and Reinforcement Learning
Wang, Shuopeng
Wang, Rixin
Liu, Yanhui
Zhang, Ying
Hao, Lina
ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2025, 148
[9] Efficient DER Voltage Control Using Ensemble Deep Reinforcement Learning
Obert, James
Trevizan, Rodrigo D.
Chavez, Adrian
2022 5TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE FOR INDUSTRIES, AI4I, 2022, : 55 - 58
[10] SATELLITE FORMATION CONTROL VIA DEEP REINFORCEMENT LEARNING
Broida, Jacob
Linares, Richard
FIRST IAA/AAS SCITECH FORUM ON SPACE FLIGHT MECHANICS AND SPACE STRUCTURES AND MATERIALS, 2020, 170 : 343 - 352
