Policy Gradient Reinforcement Learning for I/O Reordering on Storage Servers

Cited by: 0
Authors
Dheenadayalan, Kumar [1 ]
Srinivasaraghavan, Gopalakrishnan [1 ]
Muralidhara, V. N. [1 ]
Affiliations
[1] International Institute of Information Technology Bangalore, Bengaluru, India
Source
NEURAL INFORMATION PROCESSING, ICONIP 2017, PT I | 2017, Vol. 10634
Keywords
Policy gradient; Filer; I/O reordering; Overload; Throughput;
DOI
10.1007/978-3-319-70087-8_87
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Deep customization of storage architectures to the applications they support is often undesirable: the nature of application data is dynamic, applications are replaced far more often than storage systems are, and usage patterns change with time. A continuously learning software intervention that dynamically adapts to the changing workload pattern is the easiest way to bridge this gap. As borne out by our experiments, the overhead induced by such software interventions is negligible for large-scale storage systems. Reinforcement learning offers a way to learn dynamically from a continuous data stream and take actions that optimize a future goal. We adapt policy gradient reinforcement learning to learn a policy that minimizes I/O wait time and thereby maximizes I/O throughput. A set of discrete actions, consisting of switches between scheduling schemes, is used to dynamically reorder client-specific I/O operations. Results reveal that the I/O reordering policy learned through reinforcement learning yields a significant improvement in overall I/O throughput.
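The approach described in the abstract can be illustrated with a minimal REINFORCE-style sketch: a softmax policy over a small set of discrete scheduler-switch actions, updated from a reward that penalizes I/O wait time. The feature set, scheduler names, and the synthetic stand-in for the filer's wait time below are all illustrative assumptions, not the authors' implementation.

    # Minimal REINFORCE sketch of policy gradient over discrete
    # scheduler-switch actions. All names and the synthetic wait-time
    # model are assumptions for illustration only.
    import numpy as np

    rng = np.random.default_rng(0)

    N_FEATURES = 4                         # e.g. queue depth, read/write mix (assumed)
    ACTIONS = ["fifo", "sstf", "deadline"] # hypothetical scheduling schemes
    K = len(ACTIONS)
    ALPHA = 0.05                           # learning rate

    theta = np.zeros((K, N_FEATURES))      # linear softmax policy parameters
    baseline = 0.0                         # running reward baseline (variance reduction)

    def policy(state):
        """Softmax probabilities over the K discrete scheduler switches."""
        logits = theta @ state
        p = np.exp(logits - logits.max())
        return p / p.sum()

    def observe_wait_time(state, action):
        """Stand-in for the filer: returns an I/O wait time for the chosen
        scheduler under the current workload. Purely synthetic here."""
        best = int(state.argmax()) % K     # pretend one scheme fits best
        return 1.0 + abs(action - best) + 0.1 * rng.standard_normal()

    for episode in range(2000):
        state = rng.random(N_FEATURES)     # current workload features
        probs = policy(state)
        a = rng.choice(K, p=probs)
        reward = -observe_wait_time(state, a)   # minimizing wait time
        baseline = 0.9 * baseline + 0.1 * reward

        # REINFORCE gradient: grad log pi(a|s) = (1[k=a] - pi(k|s)) * s
        grad = -np.outer(probs, state)
        grad[a] += state
        theta += ALPHA * (reward - baseline) * grad

Under these assumptions the policy learns to pick the scheduler with the lowest expected wait time for each workload state; in the paper's setting the reward would instead come from measured wait times on the storage server.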
Pages: 849-859 (11 pages)