A goal-oriented reinforcement learning for optimal drug dosage control

Authors
Zhang, Qian [1]
Li, Tianhao [1]
Li, Dengfeng [1]
Lu, Wei [1]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Management & Econ, Chengdu, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Goal-oriented; Reinforcement learning; Hierarchical decision; Multi-agent; Drug dosage control; SEPTIC SHOCK; SEPSIS; MORTALITY; LEVEL; CARE;
DOI
10.1007/s10479-024-06029-x
Chinese Library Classification
C93 [Management science]; O22 [Operations research]
Discipline classification codes
070105; 12; 1201; 1202; 120202
Abstract
The dosage control of therapeutic drugs is a concern for clinicians. Whether a clinician's dosing decision is correct and efficient can determine a patient's survival. In intensive care units (ICUs), medication decision-making is a dynamic and continuous process that is difficult to handle with traditional intelligent technologies. While reinforcement learning (RL) has an advantage in handling sequential decision-making, it faces challenges in multi-level problems because of delayed rewards and complex states. Hierarchical reinforcement learning (HRL) is a layered algorithm based on RL. HRL has been shown to be effective for problems with delayed, sparse rewards, and it reduces learning difficulty by dividing a long-term goal into stages. Inspired by this, we propose a goal-oriented reinforcement learning (GORL) approach to optimize drug dosage control for sepsis patients. Specifically, GORL employs two agents that make dosage decisions cooperatively by simulating the behavior of clinicians. GORL decomposes a long-term goal into several short-term goals to reduce the exploration space. For the long-term goal, a goal-oriented formulation is introduced to address the sparse-reward problem. The goal-oriented hierarchical structure helps the agents interact and cooperate to achieve each short-term goal. In addition, we design a hindsight intrinsic reward to balance the long-term and short-term goals, which allows the agents to learn an optimal drug dosage control policy. We conduct our experiments on MIMIC-IV, one of the largest medical datasets. The experimental results show that our model outperforms the baseline algorithms and learns a more robust treatment policy than that of clinicians, reducing patient mortality by 10.23%.
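The abstract does not specify how the hindsight intrinsic reward is computed. As a rough illustration of the general idea only, the sketch below uses generic hindsight goal relabeling: a low-level segment that missed its assigned short-term goal is relabeled as if the state it actually reached had been the goal, so the segment still yields a non-trivial learning signal. The function names, the dictionary layout, the tolerance `tol`, and the 0/-1 reward are all assumptions made for this example, not the authors' formulation.

```python
import numpy as np

def intrinsic_reward(achieved, goal, tol=0.1):
    """Assumed sparse reward: 0 if the achieved state is within tol of the goal, else -1."""
    distance = np.linalg.norm(np.asarray(achieved, dtype=float) - np.asarray(goal, dtype=float))
    return 0.0 if distance <= tol else -1.0

def hindsight_relabel(segment, tol=0.1):
    """Relabel a low-level segment using the finally reached state as the goal.

    `segment` is a list of dicts with keys "state", "action", "next_state",
    "goal", and "reward" (a hypothetical layout chosen for this sketch).
    """
    new_goal = segment[-1]["next_state"]
    relabeled = []
    for step in segment:
        reward = intrinsic_reward(step["next_state"], new_goal, tol)
        relabeled.append({**step, "goal": new_goal, "reward": reward})
    return relabeled

# Toy one-dimensional segment: the agent was asked to reach 2.0 but only reached 1.0.
segment = [
    {"state": [0.0], "action": 0.5, "next_state": [0.5], "goal": [2.0], "reward": -1.0},
    {"state": [0.5], "action": 0.5, "next_state": [1.0], "goal": [2.0], "reward": -1.0},
]
relabeled = hindsight_relabel(segment)
```

In this toy segment every original reward is -1 because the assigned goal was never reached; after relabeling, the final step earns a reward of 0 against the substituted goal.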
Pages: 1403-1423
Page count: 21