Efficient Big-Data Access: Taxonomy and a Comprehensive Survey

Cited by: 4
Authors
Alazzawe, Anis [1]
Pal, Amitangshu [1]
Kant, Krishna [1]
Affiliations
[1] Temple Univ, Comp & Informat Sci, Philadelphia, PA 19122 USA
Funding
National Science Foundation (USA)
Keywords
Metadata; Big Data; Nonvolatile memory; Taxonomy; Servers; Complexity theory; Bandwidth; Locality exploitation; Proximity optimization; Data reduction; Redundancy removal; Data filtering; Data analytics; Cloud storage; Challenges; Management
DOI
10.1109/TBDATA.2020.3036813
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Emerging systems not only generate huge amounts of data but also expect this data to be analyzed expeditiously to drive online decision-making and control. Identifying the most relevant data and making it available close to the computation thus becomes a central challenge in driving the big data revolution. Storage systems play a crucial role in enabling efficient access to stored data, and intelligent storage management techniques are therefore central to addressing the problem. Generally, as data volume increases, the marginal utility of an "average" data item tends to decline, so greater effort is required to identify the most valuable data items and make them available with minimal overhead and latency. Data-driven mechanisms have a big role to play in solving this needle-in-the-haystack problem. In this paper, we propose a taxonomy that provides a structure for understanding the common issues surrounding these techniques. We discuss these techniques and articulate many research challenges and opportunities.
Pages: 356-376
Page count: 21