Modeling SmallClient indexing framework for big data analytics

Cited by: 3

Authors:

Siddiqa, Aisha [1]

Karim, Ahmad [2]

Chang, Victor [3]

Affiliations:

[1] Univ Malaya, Fac Comp Sci & Informat Technol, Kuala Lumpur 50603, Malaysia

[2] Bahauddin Zakariya Univ, Dept Informat Technol, Multan 60000, Pakistan

[3] Xian Jiaotong Liverpool Univ, IBSS, Suzhou 100044, Peoples R China

Source:

JOURNAL OF SUPERCOMPUTING | 2018, Vol. 74, No. 10

Keywords:

Big data indexing; Big data analytics; Indexing; Big data; CLOUD; EFFICIENT; PERFORMANCE; STORAGE;

DOI:

10.1007/s11227-017-2052-4

Chinese Library Classification:

TP3 [Computing technology, computer technology];

Subject classification code:

0812;

Abstract:

The continual growth of big data generated by electronic and automated devices degrades the data retrieval performance of contemporary big data analytics technologies and makes the exploration and adoption of improved procedures inevitable. A properly designed index on big data facilitates analytics by allowing data sets to be stored, processed, accessed and analyzed more quickly and efficiently. This paper proposes a novel mathematical model that introduces an indexing mechanism and improves data retrieval performance on data sets while supporting the growing volume of big data. The model is composed of three modules: block creation, index creation and query execution. The block creation module improves record access performance while avoiding remote access delays. The index creation module allows the maximum possible number of indexes for big data with minimal indexing overhead. The query execution module performs data search and retrieval operations for user search queries. The evaluation of the proposed mathematical model shows that search performance improves for both small and big data sets, with minimal overhead in data uploading and indexing time. We further verify the results by implementing the SmallClient logic on a four-node physical cluster, which confirms the improved performance of the proposed approach.
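The three-stage structure named in the abstract (block creation, index creation, query execution) can be pictured with a toy, in-memory sketch. This is not the paper's mathematical model or the SmallClient implementation: the function names, the fixed-size block layout, the `(block, offset)` index entries and the sample records are all assumptions made purely for illustration.

```python
from collections import defaultdict

# Hypothetical helper: split records into fixed-size blocks.
# The real block creation module is not described in the abstract.
def create_blocks(records, block_size):
    return [records[i:i + block_size] for i in range(0, len(records), block_size)]

# Hypothetical helper: map each value of `key` to the (block, offset)
# positions that hold it, so a lookup does not need a full scan.
def create_index(blocks, key):
    index = defaultdict(list)
    for block_id, block in enumerate(blocks):
        for offset, record in enumerate(block):
            index[record[key]].append((block_id, offset))
    return index

# Hypothetical helper: answer an equality query through the index.
def execute_query(blocks, index, value):
    return [blocks[b][o] for b, o in index.get(value, [])]

# Made-up sample data, not taken from the paper.
records = [
    {"id": 1, "city": "Multan"},
    {"id": 2, "city": "Suzhou"},
    {"id": 3, "city": "Multan"},
    {"id": 4, "city": "Kuala Lumpur"},
]

blocks = create_blocks(records, block_size=2)
index = create_index(blocks, key="city")
result = execute_query(blocks, index, "Multan")
```

In this sketch the query only touches the positions listed in the index, which is the general reason an index can reduce retrieval time; how the paper balances that gain against upload and indexing overhead is not recoverable from the abstract alone.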


Pages: 5241-5262

Page count: 22
