Data Temperature Informed Streaming for Optimising Large-Scale Multi-Tiered Storage

Cited by: 1
Authors
Davies-Tagg, Dominic [1]
Anjum, Ashiq [2]
Zahir, Ali [2]
Liu, Lu [2]
Yaseen, Muhammad Usman [3]
Antonopoulos, Nick [4]
Affiliations
[1] Univ Derby, Dept Comp, Derby DE22 1GB, England
[2] Univ Leicester, Dept Informat, Leicester LE1 7RH, England
[3] COMSATS Univ Islamabad, Dept Comp Sci, Islamabad 45550, Pakistan
[4] Edinburgh Napier Univ, Edinburgh EH11 4BN, Scotland
Keywords
data temperature; hot and cold data; multi-tiered storage; metadata variable; multi-temperature system;
DOI
10.26599/BDMA.2023.9020039
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Data temperature is a response to the ever-growing amount of data. These data have to be stored, but it has been observed that only a small portion of the data is accessed frequently at any one time. This leads to the concept of hot and cold data. Cold data can be migrated away from high-performance nodes to free up performance for higher-priority data. Existing studies classify hot and cold data primarily on the basis of data age and usage frequency. We present this as a limitation of current implementations of data temperature, because classifying by age assumes that all new data have priority, and classifying by usage is purely reactive. We propose new variables and conditions that support smarter decisions about which data are hot or cold and allow greater user control over data location and movement. We identify new metadata variables and user-defined variables to extend the current data temperature value. We further establish rules and conditions for limiting unnecessary movement of data, which helps to prevent wasted input/output (I/O) costs. We also propose a hybrid algorithm that combines the existing variables with the new variables and conditions into a single data temperature. The proposed system provides higher accuracy, increases performance, and gives greater user control for optimal positioning of data within multi-tiered storage solutions.
Pages: 371-398
Number of pages: 28