Research on deep reinforcement learning in Internet of vehicles edge computing based on Quasi-Newton method

Cited by: 0
Authors
Zhang, Jianwu [1 ]
Lu, Zetao [1 ]
Zhang, Qianhua [2 ,3 ]
Zhan, Ming [4 ]
Affiliations
[1] School of Communication Engineering, Hangzhou Dianzi University, Hangzhou
[2] Research Center for Space Computing System, Zhejiang Lab, Hangzhou
[3] College of Information Science and Electronic Engineering, Zhejiang University, Hangzhou
[4] College of Electronic and Information Engineering, Taizhou University, Taizhou
Source
Tongxin Xuebao/Journal on Communications | 2024, Vol. 45, No. 05
Keywords
deep reinforcement learning; Internet of vehicles; Quasi-Newton method; task offloading
DOI
10.11959/j.issn.1000-436x.2024101
Abstract
To address ineffective task offloading decisions caused by multitasking and resource constraints in vehicular networks, a quasi-Newton deep reinforcement learning dual-phase online offloading (QNRLO) algorithm was proposed. The algorithm first incorporated batch normalization to optimize the training of the deep neural network, and then applied the quasi-Newton method to efficiently approximate the optimal solution. Through this dual-stage optimization, performance under multitasking and dynamic wireless channels was significantly enhanced and computational efficiency was improved. By introducing Lagrange multipliers and reconstructing the dual function, the non-convex optimization problem was transformed into a convex optimization of the dual function, ensuring the global optimality of the algorithm. In addition, the allocation of system transmission time in the vehicular network model was considered, enhancing the practicality of the algorithm. Compared with existing algorithms, the proposed algorithm significantly improves the convergence and stability of task offloading, effectively addresses task offloading in vehicular networks, and offers high practicality and reliability. © 2024 Editorial Board of Journal on Communications. All rights reserved.
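The paper does not publish code; the following is a minimal Python sketch of the dual-phase idea summarized in the abstract, assuming a DROO-style wireless-powered edge-computing setup: a batch-normalized deep neural network proposes a binary offloading decision, and a quasi-Newton solver (SciPy's L-BFGS-B here) then refines the transmission-time split. All names (OffloadingNet, computation_rate) and the toy rate model are illustrative assumptions, not taken from the paper.

import numpy as np
import torch
import torch.nn as nn
from scipy.optimize import minimize

class OffloadingNet(nn.Module):
    """DNN with batch normalization that maps channel gains to offloading probabilities."""
    def __init__(self, n_devices, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_devices, hidden),
            nn.BatchNorm1d(hidden),      # batch normalization stabilizes training
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.BatchNorm1d(hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_devices),
            nn.Sigmoid(),                # per-device offloading probability
        )

    def forward(self, h):
        return self.net(h)

def computation_rate(tau, h, x):
    """Toy weighted-sum computation rate for a time split tau in (0, 1):
    local devices (x_i = 0) compute over the whole slot, offloading devices
    (x_i = 1) transmit in the remaining fraction. Purely illustrative."""
    local = np.sum((1 - x) * np.cbrt(h))                          # placeholder local rate
    offload = np.sum(x * (1 - tau) * np.log2(1 + h * tau / max(1 - tau, 1e-9)))
    return local + offload

def quasi_newton_time_allocation(h, x):
    """Second phase: refine the transmission-time split with a quasi-Newton
    solver (L-BFGS-B) instead of an exhaustive search."""
    obj = lambda tau: -computation_rate(tau[0], h, x)
    res = minimize(obj, x0=[0.5], method="L-BFGS-B", bounds=[(1e-3, 1 - 1e-3)])
    return res.x[0], -res.fun

if __name__ == "__main__":
    n = 8
    net = OffloadingNet(n)
    net.eval()                                                    # use running BN statistics
    h = np.random.exponential(1.0, size=n)                        # random channel gains
    with torch.no_grad():
        probs = net(torch.tensor(h, dtype=torch.float32).unsqueeze(0)).numpy()[0]
    x = (probs > 0.5).astype(float)                               # quantize to a binary offloading decision
    tau, rate = quasi_newton_time_allocation(h, x)
    print(f"offloading decision: {x}, time split tau={tau:.3f}, rate={rate:.3f}")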
Pages: 90-100
Number of pages: 10