Methodology of Data Popularity Forecasting in High-Energy Physics Experiments on Unbalanced and Irregular Time-series Data

Authors
Grigorieva, M. A. [1, 2]
Popova, N. N. [2]
Vartanov, D. A. [2]
Shubin, M. V. [2]
Affiliations
[1] Moscow Ctr Fundamental & Appl Math, Moscow 119234, Russia
[2] Lomonosov Moscow State Univ, Moscow 119991, Russia
Keywords
data popularity; high-energy physics; distributed computing; machine learning; predictive analytics; time series analysis
DOI
10.1134/S1995080224603771
Chinese Library Classification (CLC) number
O1 [Mathematics]
Discipline classification code
0701; 070101
Abstract
This study introduces a method to forecast data popularity in high-energy physics (HEP) experiments, focusing on unbalanced and irregular time-series data. The goal is to predict the popularity of specific datasets accurately over time, which is crucial for optimizing data replication and placement strategies and enhancing distributed computing efficiency in HEP experiments. The methodology utilizes advanced machine learning techniques and time-series analysis to tackle the challenges posed by the unbalanced nature of the data. The paper outlines the key components of the methodology, including data preprocessing and balancing techniques, filtration, and model selection. To evaluate the effectiveness of the presented approach, the authors conduct experiments on real-world HEP datasets, comparing their predictions against actual data. The findings of this study have important implications for resource management and decision-making in distributed computing of various large-scale scientific projects. By providing forecasts of data popularity, researchers and administrators can efficiently allocate resources, optimize data storage and retrieval mechanisms, and improve overall data processing efficiency.
Pages: 3072-3084
Page count: 13