Hybrid Control Policy for Artificial Pancreas via Ensemble Deep Reinforcement Learning

Cited: 0
Authors
Lv, Wenzhou [1 ]
Wu, Tianyu [1 ]
Xiong, Luolin [1 ]
Wu, Liang [2 ,3 ]
Zhou, Jian [2 ,3 ]
Tang, Yang [1 ]
Qian, Feng [1 ]
Affiliations
[1] East China Univ Sci & Technol, State Key Lab Ind Control Technol, Shanghai 200237, Peoples R China
[2] Shanghai Jiao Tong Univ, Metab, Shanghai, Peoples R China
[3] Shanghai Diabet Inst, Shanghai Clin Ctr Diabet, Peoples Hosp 6, Shanghai, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Glucose; Insulin; Safety; Uncertainty; Pancreas; Metalearning; Accuracy; Artificial pancreas; glucose control; diabetes; reinforcement learning; meta learning; TYPE-1; MPC; SAFETY;
DOI
10.1109/TBME.2024.3451712
Chinese Library Classification (CLC)
R318 [Biomedical Engineering];
Discipline Code
0831;
Abstract
Objective: The artificial pancreas (AP) shows promise for closed-loop glucose control in type 1 diabetes mellitus (T1DM). However, designing effective control policies for the AP remains challenging due to complex physiological processes, delayed insulin response, and inaccurate glucose measurements. While model predictive control (MPC) offers safety and stability through its dynamic model and safety constraints, it lacks individualization and is adversely affected by unannounced meals. Conversely, deep reinforcement learning (DRL) provides personalized and adaptive strategies but struggles with distribution shifts and substantial data requirements. Methods: We propose a hybrid control policy for the artificial pancreas (HyCPAP) to address these challenges. HyCPAP combines an MPC policy with an ensemble DRL policy, leveraging the strengths of both while compensating for their respective limitations. To facilitate faster deployment of AP systems in real-world settings, we further incorporate meta-learning techniques into HyCPAP, leveraging prior experience and patient-shared knowledge to enable fast adaptation to new patients with limited available data. Results: We conduct extensive experiments using the UVA/Padova T1DM simulator across five scenarios. Our approaches achieve the highest percentage of time spent in the desired range and the lowest occurrences of hypoglycemia. Conclusion: The results clearly demonstrate the superiority of our methods for closed-loop glucose management in individuals with T1DM. Significance: This study presents novel control policies for AP systems, affirming their great potential for efficient closed-loop glucose control.
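The core idea of the abstract, fusing an MPC baseline with an ensemble of DRL policies under hard safety limits, can be illustrated with a minimal sketch. Everything here is an assumption for illustration only: the function names (`mpc_dose`, `ensemble_drl_dose`, `hybrid_dose`), the proportional stand-ins for the real MPC and DRL controllers, the weighted-average fusion rule, and all gains and thresholds. The paper's actual HyCPAP algorithm is not specified in this record.

```python
import numpy as np

def mpc_dose(glucose, target=120.0, gain=0.01):
    """Toy proportional surrogate for an MPC insulin dose (U/h).

    A real MPC would solve a constrained optimization over a
    glucose-insulin dynamic model; this stand-in only shows the
    interface: glucose reading in, suggested dose out.
    """
    return max(0.0, gain * (glucose - target))

def ensemble_drl_dose(glucose, policies):
    """Average the doses proposed by an ensemble of DRL policies."""
    return float(np.mean([p(glucose) for p in policies]))

def hybrid_dose(glucose, policies, w=0.5, max_dose=5.0):
    """Blend the MPC and ensemble-DRL proposals, then clip for safety.

    `w` weights the DRL ensemble against the MPC baseline; the clip
    mimics the hard insulin-delivery constraints an AP must respect.
    """
    fused = (1.0 - w) * mpc_dose(glucose) + w * ensemble_drl_dose(glucose, policies)
    return float(np.clip(fused, 0.0, max_dose))

# Three toy "DRL policies": linear controllers with slightly different gains,
# standing in for independently trained networks in the ensemble.
policies = [lambda g, k=k: max(0.0, k * (g - 110.0)) for k in (0.008, 0.010, 0.012)]

print(hybrid_dose(180.0, policies))  # hyperglycemia: positive fused dose
print(hybrid_dose(70.0, policies))   # hypoglycemia: dose suspended at 0
```

Averaging the ensemble members and clipping the fused command reflect the abstract's two stated goals, robustness to individual-policy uncertainty and avoidance of hypoglycemia, but the specific fusion rule is a placeholder.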
Pages: 309-323
Page count: 15