Towards a framework for large-scale multimedia data storage and processing on Hadoop platform

Cited by: 20
Authors
Lai, Wei Kuang [1]
Chen, Yi-Uan [1]
Wu, Tin-Yu [2]
Obaidat, Mohammad S. [3]
Affiliations
[1] Natl Sun Yat Sen Univ, Dept Comp Sci & Engn, Kaohsiung 80424, Taiwan
[2] Natl Ilan Univ, Dept Comp Sci & Informat Engn, Ilan, Taiwan
[3] Monmouth Univ, Dept Comp Sci & Software Engn, Monmouth Jct, NJ 07764 USA
Keywords
Cloud computing; Hadoop; MapReduce; BigTable; High performance computing
DOI
10.1007/s11227-013-1050-4
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Cloud computing is a form of distributed computing in which multiple computers on the service side execute computations simultaneously. To process the growing volume of multimedia data, numerous large-scale multimedia data storage and computing techniques have been developed for the cloud. Among these techniques, Hadoop plays a key role. Hadoop, a computing cluster built from low-priced hardware, can process petabytes of multimedia data in parallel and offers high reliability, high efficiency, and high scalability. Large-scale multimedia data computing draws not only on the core techniques, Hadoop and MapReduce, but also on data collection techniques such as File Transfer Protocol and Flume. Distributed system configuration allocation, automatic installation, and the building and management of monitoring platforms are involved as well. A reliable large-scale multimedia data platform can therefore be offered only when all of these techniques are integrated. In this paper, we introduce how cloud computing can make a breakthrough by proposing a multimedia social network dataset on the Hadoop platform and implementing a prototype version. Detailed specifications and design issues are also discussed. An important finding is that multimedia social networking analysis takes less time on a cloud Hadoop platform than on a single computer. The paper demonstrates the advantages of cloud computing over traditional data processing practices, proposes applicable framework designs and tools for large-scale data processing, and reports the experimental multimedia data, including data sizes and processing times.
Pages: 488-507
Page count: 20