Benchmarking large-scale data management for Internet of Things

Citations: 8
Authors
Hendawi, Abdeltawab [1, 2]
Gupta, Jayant [3]
Liu, Jiayi [6]
Teredesai, Ankur [7]
Ramakrishnan, Naveen [8]
Shah, Mohak [6]
El-Sappagh, Shaker [4, 5]
Kwak, Kyung-Sup [4]
Ali, Mohamed [7]
Affiliations
[1] Univ Rhode Isl, Dept Comp Sci & Stat, Kingston, RI 02881 USA
[2] Cairo Univ, Fac Comp & Informat, Giza, Egypt
[3] Univ Minnesota, Comp Sci & Engn, Minneapolis, MN USA
[4] Inha Univ, Dept Informat & Commun Engn, Incheon, South Korea
[5] Benha Univ, Fac Comp & Informat, Informat Syst Dept, Kaliobeya, Egypt
[6] LG Elect, Seoul, South Korea
[7] Univ Washington, Ctr Data Sci, Tacoma, WA USA
[8] Robert Bosch LLC, Ctr AI, Palo Alto, CA USA
Funding
National Research Foundation, Singapore
Keywords
Benchmarking; NoSQL; Distributed data management; Parallel data management; Internet of things (IoT); MongoDB; Cassandra; HBase; CHALLENGES;
DOI
10.1007/s11227-019-02984-6
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
In the current era of the Internet of Things (IoT), a massive number of sensors are used in our daily lives. Sensors are everywhere around us: in our homes, workplaces, streets, cars, and even on ourselves. Examples include home appliances, wearable devices, and medical sensors. These sensors generate a huge amount of dynamic, heterogeneous, and unstructured data that needs special handling beyond the capabilities of conventional relational databases. Thus, it is necessary to identify a suitable data management platform to store and query this data. Despite the popularity of NoSQL data stores and their efficiency in processing various types of big data, there is no single guiding study of how they behave with IoT datasets. IoT data has its own distinguishing characteristics: it comes from various sensors, in a wide range of formats and at high velocity, and it requires high-throughput processing with low latency. NoSQL data stores are commonly used to provide flexibility and availability for big data handling. However, there is a lack of comprehensive studies about which NoSQL data store performs best with respect to the two scalability aspects (scale-up and scale-out) in a distributed and parallel processing environment. This paper benchmarks the commonly used NoSQL data stores (MongoDB, Cassandra, and HBase) and compares their performance on a real industrial IoT dataset. In addition, we focus on comparing the throughput, latency, and run time of the evaluated NoSQL data stores.
Pages: 8207-8230
Page count: 24