Scalable Queries For Large Datasets Using Cloud Computing: A Case Study

Cited by: 0
Authors
McGlothlin, James P. [1]
Khan, Latifur [1]
Affiliations
[1] Univ Texas Dallas, Richardson, TX 75083 USA
Keywords
Cloud Computing;
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Cloud computing is rapidly growing in popularity as a solution for processing and retrieving huge amounts of data over clusters of inexpensive commodity hardware. The most common data model used by cloud computing software is the NoSQL data model. While this data model is extremely scalable, it is much more efficient for simple retrievals and scans than for the complex analytical queries typical of a relational database model. In this paper, we evaluate emerging cloud computing technologies using a representative use case: analyzing telecommunications logs for performance monitoring and quality assurance. The size of such logs is growing exponentially as more devices communicate more frequently and the amount of data being transferred steadily increases. We analyze potential solutions for providing a scalable database that supports both retrieval and analysis. We investigate and analyze all the major open source cloud computing solutions and designs, then choose the most applicable subset of these technologies for experimentation. We provide a performance evaluation of these products, analyze our results, and make recommendations. This paper provides a comprehensive survey of technologies for scalable data processing and an in-depth performance evaluation of these technologies.
Pages: 8 - 16
Page count: 9
Related papers
50 in total
  • [31] A Case Study on Algebraic Specification of Cloud Computing
    Liu, Dongmei
    Zhu, Hong
    Bayley, Ian
    PROCEEDINGS OF THE 2013 21ST EUROMICRO INTERNATIONAL CONFERENCE ON PARALLEL, DISTRIBUTED, AND NETWORK-BASED PROCESSING, 2013, : 269 - 273
  • [32] A Scalable Data Platform for Cloud Computing Systems
    Liu, Liang
    Wo, Tianyu
    Applied Decisions in Area of Mechanical Engineering and Industrial Manufacturing, 2014, 577 : 860 - 864
  • [33] Compression of Large genomic datasets using COMRAD on Parallel Computing Platform
    Biji, Christopher Leela
    Madhu, Manu K.
    Vishnu, Vineetha
    Satheesh, Kumar K.
    Vijayakumar
    Nair, Achuthsankar S.
    BIOINFORMATION, 2015, 11 (05) : 267 - 271
  • [34] Frequent Item Set Mining of Large Datasets Using CUDA Computing
    Karthik, Peddi
    Banu, J. Saira
    SOFT COMPUTING FOR PROBLEM SOLVING, SOCPROS 2018, VOL 2, 2020, 1057 : 739 - 747
  • [35] Scalable OLAP queries processing towards large cluster
    Wang, Hui-Ju
    Qin, Xiong-Pai
    Wang, Shan
    Zhang, Yan-Song
    Li, Fu-Rong
    Jisuanji Xuebao/Chinese Journal of Computers, 2015, 38 (01) : 45 - 58
  • [36] Cloud Computing Service: The Case of Large Matrix Determinant Computation
    Lei, Xinyu
    Liao, Xiaofeng
    Huang, Tingwen
    Li, Huaqing
    IEEE TRANSACTIONS ON SERVICES COMPUTING, 2015, 8 (05) : 688 - 700
  • [37] Scalable risk assessment method for cloud computing using game theory (CCRAM)
    Furuncu, Evrim
    Sogukpinar, Ibrahim
    COMPUTER STANDARDS & INTERFACES, 2015, 38 : 44 - 50
  • [38] A Scalable Cloud Platform using Matlab Distributed Computing Server Integrated with HDFS
    Dutta, Rahul
    Annappa, B.
    2012 INTERNATIONAL SYMPOSIUM ON CLOUD AND SERVICES COMPUTING (ISCOS 2012), 2012, : 141 - 145
  • [39] Scalable Intrusion Detection Systems Log Analysis using Cloud Computing Infrastructure
    Kumar, Manish
    Hanumanthappa, M.
    2013 IEEE INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND COMPUTING RESEARCH (ICCIC), 2013, : 206 - 209
  • [40] CloudProteoAnalyzer: scalable processing of big data from proteomics using cloud computing
    Li, Jiancheng
    Xiong, Yi
    Feng, Shichao
    Pan, Chongle
    Guo, Xuan
    BIOINFORMATICS ADVANCES, 2024, 4 (01):