Processing Big Data with Apache Hadoop in the Current Challenging Era of COVID-19

Cited by: 15
Authors
Azeroual, Otmane [1]
Fabre, Renaud [2]
Affiliations
[1] German Ctr Higher Educ Res & Sci Studies DZHW, D-10117 Berlin, Germany
[2] Univ Paris 08, Dionysian Econ Lab LED, F-93200 St Denis, France
Keywords
big data; data processing; unstructured data; large amounts of data; COVID-19; challenges; Hadoop technology; MapReduce; WordCount; ANALYTICS
DOI
10.3390/bdcc5010012
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Big data have become a global strategic issue, as increasingly large amounts of unstructured data challenge the IT infrastructure of global organizations and threaten their capacity for strategic forecasting. As experienced in former massive information issues, big data technologies, such as Hadoop, should efficiently tackle the incoming large amounts of data and provide organizations with relevant processed information that was formerly neither visible nor manageable. After having briefly recalled the strategic advantages of big data solutions in the introductory remarks, in the first part of this paper, we focus on the advantages of big data solutions in the currently difficult time of the COVID-19 pandemic. We characterize it as an endemic heterogeneous data context; we then outline the advantages of technologies such as Hadoop and its IT suitability in this context. In the second part, we identify two specific advantages of Hadoop solutions, globality combined with flexibility, and we notice that they are at work with a "Hadoop Fusion Approach" that we describe as an optimal response to the context. In the third part, we justify selected qualifications of globality and flexibility by the fact that Hadoop solutions enable comparable returns in opposite contexts of models of partial submodels and of models of final exact systems. In part four, we remark that in both these opposite contexts, Hadoop's solutions allow a large range of needs to be fulfilled, which fits with requirements previously identified as the current heterogeneous data structure of COVID-19 information. In the final part, we propose a framework of strategic data processing conditions. To the best of our knowledge, they appear to be the most suitable to overcome COVID-19 massive information challenges.
Pages: 18
Related papers
50 in total
  • [31] In-Memory Parallel Processing of Massive Remotely Sensed Data Using an Apache Spark on Hadoop YARN Model
    Huang, Wei
    Meng, Lingkui
    Zhang, Dongying
    Zhang, Wen
    IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2017, 10 (01) : 3 - 19
  • [33] Towards Efficient Big Data: Hadoop Data Placing and Processing
    Bahadi, Jihane
    El Asri, Bouchra
    Courtine, Melanie
    Rhanoui, Maryem
    Kergosien, Yannick
    2ND INTERNATIONAL CONFERENCE ON SMART DIGITAL ENVIRONMENT (ICSDE'18), 2018, : 42 - 47
  • [34] Enhancing Dataset Processing in Hadoop YARN Performance for Big Data Applications
    Al-Absi, Ahmed Abdulhakim
    Kang, Dae-Ki
    Kim, Myong-Jong
    ADVANCED MULTIMEDIA AND UBIQUITOUS ENGINEERING: FUTURE INFORMATION TECHNOLOGY, VOL 2, 2016, 354 : 9 - 15
  • [35] Performance Modeling and Analysis of a Hadoop Cluster for Efficient Big Data Processing
    Lim, JongBeom
    Ahnh, Jong-Suk
    Lee, Kang-Woo
    ADVANCED SCIENCE LETTERS, 2016, 22 (09) : 2314 - 2319
  • [36] A distributed evolutionary multivariate discretizer for Big Data processing on Apache Spark
    Ramirez-Gallego, S.
    Garcia, S.
    Benitez, J. M.
    Herrera, F.
    SWARM AND EVOLUTIONARY COMPUTATION, 2018, 38 : 240 - 250
  • [37] Processing Real World Datasets using Big Data Hadoop Tools
    Deshai, N.
    Sekhar, B. V. D. S.
    Reddy, P. V. G. D. Prasad
    Chakravarthy, V. V. S. S. S.
    JOURNAL OF SCIENTIFIC & INDUSTRIAL RESEARCH, 2020, 79 (07) : 631 - 635
  • [38] The Covid-19 Influence on the Desire to Stay at Home: A Big Data Architecture
    Sousa, Regina
    Oliveira, Daniela
    Carneiro, Ana
    Pinto, Luis
    Pereira, Ana
    Peixoto, Ana
    Peixoto, Hugo
    Machado, Jose
    INTELLIGENT DATA ENGINEERING AND AUTOMATED LEARNING - IDEAL 2022, 2022, 13756 : 199 - 210
  • [39] Human behavior in the time of COVID-19: Learning from big data
    Lyu, Hanjia
    Imtiaz, Arsal
    Zhao, Yufei
    Luo, Jiebo
    FRONTIERS IN BIG DATA, 2023, 6
  • [40] Big Data COVID-19 Systematic Literature Review: Pandemic Crisis
    Haafza, Laraib Aslam
    Awan, Mazhar Javed
    Abid, Adnan
    Yasin, Awais
    Nobanee, Haitham
    Farooq, Muhammad Shoaib
    ELECTRONICS, 2021, 10 (24)