An Adaptive Metadata Management Scheme Based on Deep Reinforcement Learning for Large-Scale Distributed File Systems

Cited by: 0

Authors:

Huang, Xiuqi [1]

Gao, Yuanning [1]

Zhou, Xinyi [1]

Gao, Xiaofeng [1]

Chen, Guihai [1]

Affiliations:

[1] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, MoE Key Lab Artificial Intelligence, Shanghai 200240, Peoples R China

Source:

IEEE-ACM TRANSACTIONS ON NETWORKING | 2023 / Vol. 31 / No. 06

Funding:

National Natural Science Foundation of China;

Keywords:

Metadata; Servers; File systems; Scalability; Indexes; Costs; Resource management; Metadata management; deep reinforcement learning; DDPG; distributed file system; EFFICIENT; POLICY;

DOI:

10.1109/TNET.2023.3266400

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

A major challenge confronting today's distributed metadata management schemes is how to meet the dynamic requirements of various applications by effectively mapping and migrating metadata nodes to different metadata servers (MDSs). Most existing works dynamically reallocate nodes to different servers using history-based, coarse-grained methods, and therefore fail to update the distribution of nodes in a timely and efficient manner. In this paper, we present AdaM, the first distributed metadata management scheme that leverages deep reinforcement learning, to address this problem. AdaM is an adaptive, fine-grained metadata management scheme that trains an actor-critic network to migrate "hot" metadata nodes to different MDSs based on its observations of the current "states" (i.e., the access pattern, the structure of the namespace tree, and the current distribution of nodes on MDSs). Adapting to varying access patterns, AdaM automatically migrates hot metadata nodes among servers to maintain load balance while preserving metadata locality. In addition, we propose a self-adaptive metadata cache policy, which dynamically combines server-side and client-side cache management strategies to improve query performance. Finally, we design a distributed metadata processing protocol based on two-phase commit (2PC), called MST-based 2PC, to ensure data consistency. Experiments on a real-world dataset demonstrate the superiority of our proposed method over other schemes.

Pages: 2840-2853

Page count: 14
