Optimal policy learning for COVID-19 prevention using reinforcement learning

Cited by: 18

Authors:

Uddin, M. Irfan [1]

Ali Shah, Syed Atif [2,3]

Al-Khasawneh, Mahmoud Ahmad [3]

Alarood, Ala Abdulsalam [4]

Alsolami, Eesa [5]

Affiliations:

[1] Kohat Univ Sci & Technol, Inst Comp, Kohat 26000, Pakistan

[2] Northern Univ, Fac Engn & Informat Technol, Khyber Pakhtunkhwa, Pakistan

[3] Al Madinah Int Univ, Fac Comp & Informat Technol, Kuala Lumpur, Malaysia

[4] Univ Jeddah, Coll Comp Sci & Engn, Jeddah, Saudi Arabia

[5] Univ Jeddah, Coll Comp Sci & Engn, Dept Cyber Secur, Jeddah, Saudi Arabia

Source:

JOURNAL OF INFORMATION SCIENCE | 2022 / Vol. 48 / Issue 03

Keywords:

Reinforcement learning; COVID-19; prevention; policy learning;

DOI:

10.1177/0165551520959798

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

COVID-19 has changed the lifestyle of many people because of its rapid human-to-human transmission. The spread started at the end of January 2020, and different countries used different approaches to testing, sanitization, lockdown and quarantine centres to control the spread of the virus. People are returning to work and routine activities under new-normal standards of testing, sanitization, social distancing and lockdown. People are regularly tested to identify those infected with COVID-19 and isolate them from the general public. However, testing everyone indiscriminately is expensive in terms of resource usage, so there must be an optimal policy to test only those with a higher chance of being COVID-19 positive. Similarly, sanitization is used to disinfect people and places, but it is also expensive in terms of resources, and it is not possible to disinfect every individual and street. Social distancing, lockdown and quarantine centres are different approaches used to control human-to-human transmission of the infection and to separate those infected with COVID-19. However, lockdown and quarantine centres are expensive in terms of resources because they disrupt the affairs of state and economic growth, and they negatively affect a society's quality of life. It is also not possible to provide resources to all citizens while confining them to homes or quarantine centres indefinitely. All of these measures are expensive in terms of resources, and each affects the spread of the virus, quality of life, resource use and the economy.
In this article, a novel intelligent method based on reinforcement learning (RL) is developed that quantifies distinct levels of testing, sanitization and lockdown along with their impact on the spread of the virus, quality of life, resource use and the economy. Different RL algorithms are implemented, and agents are trained with these algorithms to interact with the environment and learn the best policy. The experiments demonstrate that deep learning-based algorithms such as DQN and DDPG perform better than traditional RL algorithms such as Q-Learning and SARSA.
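
As an illustration of the agent-environment loop the abstract describes, the following is a minimal sketch of tabular Q-Learning, one of the traditional algorithms named above. It is not the authors' environment or implementation: the five infection levels, three intervention levels, the deterministic transition rule, the reward weights and all hyperparameters are assumptions chosen only to make a small runnable example of trading infection spread against intervention cost.

```python
import random

N_STATES = 5   # hypothetical infection levels: 0 (low) .. 4 (high)
N_ACTIONS = 3  # hypothetical intervention levels: 0 (none) .. 2 (strict)


def step(state, action):
    """Toy deterministic dynamics, assumed for illustration only.

    With no intervention the infection level rises by one, a moderate
    intervention holds it steady, and a strict one lowers it by one.
    The reward penalises both the resulting infection level and the
    cost of the intervention itself.
    """
    next_state = min(max(state + 1 - action, 0), N_STATES - 1)
    reward = -(2 * next_state + action)
    return next_state, reward


def train_q_learning(episodes=2000, horizon=20, alpha=0.5,
                     gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-Learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    for _ in range(episodes):
        state = rng.randrange(N_STATES)
        for _ in range(horizon):
            if rng.random() < epsilon:
                action = rng.randrange(N_ACTIONS)  # explore
            else:
                action = max(range(N_ACTIONS), key=lambda a: q[state][a])  # exploit
            next_state, reward = step(state, action)
            # Off-policy target: bootstrap from the best next action.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Return the highest-valued action for each state."""
    return [max(range(N_ACTIONS), key=lambda a: row[a]) for row in q]


q_table = train_q_learning()
policy = greedy_policy(q_table)
print(policy)
```

In this toy setting the learned policy maps each infection level to an intervention level. SARSA would differ only in the target line, bootstrapping from the action actually taken next rather than from the maximum over next actions.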

Pages: 336-348

Page count: 13
