Efficient, Reliable, and Scalable Distributed Data Processing Based on Hadoop

Cited by: 0
Authors
Liang, Yingwei [1]
Zhu, Taipeng [1]
Liu, Chenghui [1]
Chen, Yue [1]
Yang, Yufei [1]
Affiliation
[1] CSG Guangdong Power Grid Co Ltd, Informat Ctr, Guangzhou 510000, Guangdong, Peoples R China
Source
2024 INTERNATIONAL CONFERENCE ON POWER, ELECTRICAL ENGINEERING, ELECTRONICS AND CONTROL, PEEEC | 2024
Keywords
Hadoop; distributed data processing; high efficiency; reliability; scalability; Spark
DOI
10.1109/PEEEC63877.2024.00067
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
With the rapid growth of information technology and the intelligent upgrading of terminal devices, the volume of data in modern society is growing explosively. This data is not only large in scale but also structurally diverse, which poses significant challenges to traditional data processing and storage methods. Although traditional distributed database and data warehouse technologies can cope with large-scale data processing to some extent, their efficiency and flexibility remain limited when dealing with unstructured and massive data. In this context, Hadoop and MapReduce have provided new solutions for large-scale data processing. Hadoop has become one of the mainstream big data technologies owing to its simple programming model, good scalability, and strong fault tolerance. This paper designs a Hadoop-based distributed data processing system that makes full use of Hadoop's efficiency, high reliability, and high scalability to meet the needs of large-scale data processing. The experimental results show that the system delivers excellent performance and stability in practical applications and can process massive data, providing an effective solution for large-scale data processing.
Pages: 334-339
Page count: 6