Scalable Queries For Large Datasets Using Cloud Computing: A Case Study

Cited by: 0
Authors
McGlothlin, James P. [1]
Khan, Latifur [1]
Affiliations
[1] Univ Texas Dallas, Richardson, TX 75083 USA
Keywords
Cloud Computing
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Cloud computing is rapidly growing in popularity as a solution for processing and retrieving huge amounts of data over clusters of inexpensive commodity hardware. The most common data model utilized by cloud computing software is the NoSQL data model. While this data model is extremely scalable, it is much more efficient for simple retrievals and scans than for the complex analytical queries typical in a relational database model. In this paper, we evaluate emerging cloud computing technologies using a representative use case. Our use case involves analyzing telecommunications logs for performance monitoring and quality assurance. The size of such logs is growing exponentially as more devices communicate more frequently and the amount of data being transferred steadily increases. We analyze potential solutions to provide a scalable database that supports both retrieval and analysis. We investigate and analyze all the major open source cloud computing solutions and designs, and then choose the most applicable subset of these technologies for experimentation. We provide a performance evaluation of these products, analyze our results, and make recommendations. This paper provides a comprehensive survey of technologies for scalable data processing and an in-depth performance evaluation of these technologies.
Pages: 8-16
Page count: 9