Developing a data pipeline solution for big data processing

Cited by: 2
Authors
Lipovac, Ivona [1]
Babac, Marina Bagic [1]
Affiliations
[1] Univ Zagreb, Fac Elect Engn & Comp, Unska 3, HR-10000 Zagreb, Croatia
Keywords
big data; data pipeline; data processing; data analysis; cloud computing
DOI
10.1504/IJDMMM.2024.136221
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents a comprehensive exploration of the concept of big data and its management while highlighting the challenges that arise in the process. The study showcases the development of a data pipeline, designed to facilitate big data collection, integration, and analysis while addressing state-of-the-art challenges, methods, tools, and technologies. Emphasis is placed on pipeline flexibility, with a view towards enabling ease of implementation of architecture changes, seamless integration of new sources, and straightforward implementation of additional transformations in existing pipelines as needed. The pipeline architecture is discussed in detail, with a focus on its design principles, components, and implementation details, as well as the mechanisms used to ensure its reliability, scalability, and performance. Results from a range of experiments demonstrate the pipeline's effectiveness in addressing the challenges of big data management and analysis, as well as its robustness and versatility in accommodating diverse data sources and processing requirements. This study provides insights into the critical role of data pipelines in enabling effective big data management and showcases the importance of flexibility in pipeline design to ensure adaptability to evolving data processing needs.
Pages: 1 / 22
Number of pages: 23
Related papers
50 in total
  • [21] BIG BICYCLE DATA PROCESSING: FROM PERSONAL DATA TO URBAN APPLICATIONS
    Pettit, C. J.
    Lieske, S. N.
    Leao, S. Z.
    XXIII ISPRS CONGRESS, COMMISSION II, 2016, 3 (02): 173-179
  • [22] Big Data Processing Platform for smart city
    El Mendili, Saida
    El Bouzekri El Idrissi, Younes
    Hmina, Nabil
    2018 INTERNATIONAL SYMPOSIUM ON ADVANCED ELECTRICAL AND COMMUNICATION TECHNOLOGIES (ISAECT), 2018
  • [23] Big Data Processing in Cloud Computing Environments
    Ji, Changqing
    Li, Yu
    Qiu, Wenming
    Awada, Uchechukwu
    Li, Keqiu
    PROCEEDINGS OF THE 2012 12TH INTERNATIONAL SYMPOSIUM ON PERVASIVE SYSTEMS, ALGORITHMS, AND NETWORKS (I-SPAN 2012), 2012: 17-23
  • [24] Parallel Processing Strategies for Big Geospatial Data
    Werner, Martin
    FRONTIERS IN BIG DATA, 2019, 2
  • [25] Researches on Data Processing and Data Preventing Technologies in the Environment of Big Data in Power System
    Li, Nige
    Xu, Min
    Cao, Wantian
    Gao, Peng
    2015 5TH INTERNATIONAL CONFERENCE ON ELECTRIC UTILITY DEREGULATION AND RESTRUCTURING AND POWER TECHNOLOGIES (DRPT 2015), 2015: 2491-2494
  • [26] Big Data Processing in Cloud Computing Environments
    Noraziah, A.
    Fakherldin, Mohammed Adam Ibrahim
    Adam, Khalid
    Majid, Mazlina Abdul
    ADVANCED SCIENCE LETTERS, 2017, 23 (11): 11092-11095
  • [27] Categorical Big Data Processing
    Salvador-Meneses, Jaime
    Ruiz-Chavez, Zoila
    Garcia-Rodriguez, Jose
    INTELLIGENT DATA ENGINEERING AND AUTOMATED LEARNING - IDEAL 2018, PT I, 2018, 11314: 245-252
  • [28] Big Data Processing Solutions
    Barbu, Dragos Catahn
    ROMANIAN JOURNAL OF INFORMATION TECHNOLOGY AND AUTOMATIC CONTROL-REVISTA ROMANA DE INFORMATICA SI AUTOMATICA, 2019, 29 (02): 35-48
  • [29] Characterization of Big Data Stream Processing Pipeline: A Case Study using Flink and Kafka
    Javed, M. Haseeb
    Lu, Xiaoyi
    Panda, Dhabaleswar K.
    BDCAT'17: PROCEEDINGS OF THE FOURTH IEEE/ACM INTERNATIONAL CONFERENCE ON BIG DATA COMPUTING, APPLICATIONS AND TECHNOLOGIES, 2017: 1-10
  • [30] Semantic Processing on Big Data
    Qu, ZhenXin
    ADVANCES IN MULTIMEDIA, SOFTWARE ENGINEERING AND COMPUTING, VOL 2, 2011, 129: 43-48