Towards an Intelligent Framework for Scientific Computational Steering in Big Data Systems

Cited by: 0
Authors
Zhang, Yijie [1]
Wu, Chase Q. [1]
Affiliations
[1] New Jersey Inst Technol, Dept Data Sci, Newark, NJ 07102 USA
Source
2024 IEEE 24TH INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND INTERNET COMPUTING, CCGRID 2024 | 2024
Keywords
Big Data; Computational Steering; Parameter Tuning; Machine Learning; Scientific Innovation; FILE; HDFS
DOI
10.1109/CCGrid59990.2024.00085
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Next-generation scientific applications are undergoing a paradigm shift from traditional experiment-centric methodologies to extreme-scale, simulation-centric computations. These simulations involve intricate numerical models with numerous adjustable parameters and generate vast datasets that must be processed and analyzed against experimental or observational data for parameter calibration and model validation. However, manual parameter adjustment by domain experts is impractical in complex, distributed environments. To address this challenge, we propose an online computational steering service that facilitates real-time multi-user interaction. To this end, we design a versatile steering framework and conduct a theoretical performance evaluation of the steering service empowered by machine learning techniques. Furthermore, we present a case study involving the Weather Research and Forecasting (WRF) model, comparing the performance of our steering solution with alternative heuristic methods and default settings to demonstrate its efficacy.

Processing the big data generated by scientific simulations typically requires big data systems such as Hadoop, with the Hadoop Distributed File System (HDFS) serving as a foundational technology layer. HDFS supports parallel computing in upper layers and offers fault tolerance and high throughput in data storage through block replication and cluster-wide distribution. However, the default block distribution strategy in HDFS overlooks the diverse capacities and data access patterns of nodes in heterogeneous Hadoop clusters, making it suboptimal in such environments. To address this issue, we formulate a class of block distribution problems in heterogeneous clusters, establish its NP-completeness, and design an approximation algorithm, LPIR-BD, which uses linear programming-based iterative rounding and comes with a rigorous performance guarantee. Extensive experimental evaluations demonstrate the superior performance of LPIR-BD over several existing algorithms, corroborating our theoretical analyses and underscoring its efficacy in heterogeneous clusters.
Pages: 671-675
Number of pages: 5