Optimizing Big Data Retrieval and Job Scheduling Using Deep Learning Approaches

Cited by: 2
Authors
Chang, Bao Rong [1]
Tsai, Hsiu-Fen [2]
Lin, Yu-Chieh [1]
Affiliations
[1] Natl Univ Kaohsiung, Dept Comp Sci & Informat Engn, Kaohsiung, Taiwan
[2] Kaohsiung Med Univ, Dept Fragrance & Cosmet Sci, Kaohsiung, Taiwan
Source
CMES-COMPUTER MODELING IN ENGINEERING & SCIENCES | 2023 / Vol. 134 / No. 02
Keywords
Stacked sparse autoencoder; Elasticsearch; distributed indexing; data retrieval; deep neural network; job scheduling; TIME;
DOI
10.32604/cmes.2022.020128
Chinese Library Classification number
T [Industrial Technology];
Discipline classification code
08;
Abstract
Big data analytics in business intelligence lacks effective data retrieval methods and job scheduling, which causes execution inefficiency and low system throughput. This paper aims to enhance data retrieval and job scheduling in order to speed up big data analytics and overcome these inefficiency and low-throughput problems. First, integrating a stacked sparse autoencoder with Elasticsearch indexing enables fast data searching and distributed indexing, which reduces the search scope of the database and dramatically speeds up data searching. Next, a deep neural network predicts the approximate execution time of each job, which allows prioritized job scheduling based on shortest job first and reduces the average waiting time of job execution. As a result, the proposed data retrieval approach outperforms the previous method using a deep autoencoder and Solr indexing, improving the speed of data retrieval by up to 53% and increasing system throughput by 53%. In addition, the proposed job scheduling algorithm outperforms both the first-in-first-out and the memory-sensitive heterogeneous early finish time scheduling algorithms, shortening the average waiting time by up to 5% and the average weighted turnaround time by 19%, respectively.
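The scheduling idea in the abstract (order jobs by predicted execution time, shortest first) can be illustrated with a minimal sketch. It assumes all jobs arrive at the same moment and run one at a time on a single executor; the list `predicted_times` holds hypothetical values standing in for the deep neural network's predictions, and the function names are illustrative only. This is not the paper's implementation.

```python
from typing import List


def average_waiting_time(run_times: List[float]) -> float:
    """Mean time each job waits before it starts, when jobs run back to back in the given order."""
    waited = 0.0   # sum of waiting times over all jobs
    elapsed = 0.0  # time at which the next job can start
    for t in run_times:
        waited += elapsed
        elapsed += t
    return waited / len(run_times) if run_times else 0.0


def shortest_job_first(predicted: List[float]) -> List[int]:
    """Job indices sorted by predicted execution time; ties keep arrival order."""
    return sorted(range(len(predicted)), key=lambda i: predicted[i])


# Hypothetical predicted execution times (seconds), listed in arrival order.
predicted_times = [8.0, 3.0, 5.0, 1.0]

fifo_order = list(range(len(predicted_times)))
sjf_order = shortest_job_first(predicted_times)

fifo_wait = average_waiting_time([predicted_times[i] for i in fifo_order])
sjf_wait = average_waiting_time([predicted_times[i] for i in sjf_order])

print("FIFO average wait:", fifo_wait)
print("SJF average wait:", sjf_wait)
```

In this toy case the shortest-job-first order yields a lower average waiting time than arrival order. How closely a real system matches this depends on how accurate the predicted execution times are.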
Pages: 783-815
Number of pages: 33