On-Chain and Off-Chain Data Management for Blockchain-Internet of Things: A Multi-Agent Deep Reinforcement Learning Approach

Cited by: 11
Authors
Tsang, Y. P. [1]
Lee, C. K. M. [1]
Zhang, Kening [1]
Wu, C. H. [2]
Ip, W. H. [2, 3]
Affiliations
[1] Hong Kong Polytech Univ, Res Inst Adv Mfg, Dept Ind & Syst Engn, Hung Hom,Kowloon, Hong Kong, Peoples R China
[2] Hang Seng Univ Hong Kong, Dept Supply Chain & Informat Management, Shatin, Hong Kong, Peoples R China
[3] Univ Saskatchewan, Dept Mech Engn, Saskatoon, SK, Canada
Keywords
Blockchain; Internet of Things; Data management; Deep reinforcement learning; Asynchronous advantage actor-critic (A3C) algorithm; PREDICTION; STORAGE
DOI
10.1007/s10723-023-09739-x
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The emergence of blockchain technology has led applications in the Internet of Things (IoT) and cyber-physical systems to increasingly combine cloud storage with distributed ledger technology, which complicates data management in decentralised applications (DApps). Because blockchains handle large amounts of data inefficiently, effective on-chain and off-chain data management across peer-to-peer networks and cloud storage has drawn considerable attention. Reserving storage space in advance is a cost-effective way to manage cloud storage, in contrast to requesting additional space on demand in real time. Furthermore, replicating off-chain data across the peer-to-peer network can eliminate single points of failure in DApps. However, recent research has rarely addressed the optimisation of on-chain and off-chain data management in the blockchain-enabled IoT (BIoT) environment. In this study, the BIoT environment is modelled with cloud storage and blockchain orchestrated over the peer-to-peer network. The asynchronous advantage actor-critic (A3C) algorithm is applied to train intelligent agents towards an optimal policy for data packing, space reservation, and data replication, yielding an intelligent data management strategy. The experimental analysis shows that the proposed scheme converges rapidly and achieves a higher average total reward than other typical schemes, enhancing the scalability, security and reliability of blockchain-IoT networks.
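The abstract names the asynchronous advantage actor-critic (A3C) algorithm but this record does not give the paper's formulation. As a rough illustration only, the sketch below shows the generic n-step return and advantage estimate that an A3C worker typically computes for one rollout. The function name `n_step_advantages`, its parameters, and the reward numbers in the usage note are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple


def n_step_advantages(
    rewards: List[float],
    values: List[float],
    bootstrap_value: float,
    gamma: float = 0.99,
) -> Tuple[List[float], List[float]]:
    """Generic n-step returns and advantages for one worker rollout.

    Assumption: this is the textbook A3C estimate, not the paper's
    exact reward design. `values` holds the critic's estimate V(s_t)
    for each step; `bootstrap_value` is V(s_T) for the state after the
    last step (0.0 if the episode ended there).
    """
    returns: List[float] = []
    running = bootstrap_value
    # Walk the rollout backwards: R_t = r_t + gamma * R_{t+1}.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    # Advantage A_t = R_t - V(s_t): how much better the sampled outcome
    # was than the critic expected. The actor is updated along A_t and
    # the critic is regressed towards R_t.
    advantages = [ret - val for ret, val in zip(returns, values)]
    return returns, advantages
```

For example, with made-up rewards `[1.0, 0.0, 2.0]`, critic values `[0.5, 0.5, 0.5]`, a terminal bootstrap of `0.0` and `gamma=0.5`, the function returns the discounted returns `[1.5, 1.0, 2.0]` and the advantages `[1.0, 0.5, 1.5]`. In a multi-agent setting such as the one the abstract describes, each agent (data packing, space reservation, data replication) would presumably compute such estimates from its own rewards, though the record does not say how the paper structures this.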
Pages: 22
Related papers: 49 in total