A hybrid machine learning-based model for predicting flight delay through aviation big data

Cited by: 7
Authors
Dai, Min [1]
Affiliation
[1] Civil Aviat Flight Univ China, CAAC Acad, Guanghan 618307, Peoples R China
Keywords
Machine learning; Big data; Aviation data; Flight delay prediction
DOI
10.1038/s41598-024-55217-z
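Chinese Library Classification (CLC)
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09
Abstract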
Predicting flight delays is an important and challenging problem in flight scheduling and planning for airports and airlines, and in recent years various machine learning methods have been applied to it. This article proposes a new method for the problem. First, a group of potential indicators related to flight delay is introduced, and a combination of ANOVA and the Forward Sequential Feature Selection (FSFS) algorithm is used to determine the indicators that most influence flight delays. To cope with large flight data volumes, a clustering strategy based on the DBSCAN algorithm is employed: samples are clustered into similar groups, and a separate learning model predicts flight delays for each group. This strategy decomposes the problem into smaller sub-problems and improves prediction performance in terms of accuracy (by 2.49%) and processing speed (by 39.17%). The learning model used in each cluster is a novel structure based on a random forest, in which each tree component is optimized and weighted using the Coyote Optimization Algorithm (COA). Optimizing the structure of each tree component and assigning weights to them yields an increase in accuracy of at least 5.3% compared to the conventional random forest model. The performance of the proposed method in predicting flight delays is tested and compared with previous research. The findings show that the proposed approach achieves an average accuracy of 97.2%, a 4.7% improvement over previous efforts.
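The stages named in the abstract (ANOVA filtering, forward sequential feature selection, DBSCAN clustering, one model per cluster) can be illustrated with a minimal scikit-learn sketch. This is not the authors' implementation: the data are synthetic, the function names `select_features` and `fit_per_cluster` and all parameter values are hypothetical, delay prediction is assumed to be a binary classification task, and a plain `RandomForestClassifier` stands in for the COA-optimized, weighted forest, which is not reproduced here.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import (
    SelectKBest,
    SequentialFeatureSelector,
    f_classif,
)
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier


def select_features(X, y, k_anova=8, k_final=4):
    """ANOVA F-test filter, then forward sequential selection.

    Returns the selected column indices in the original matrix.
    """
    anova = SelectKBest(score_func=f_classif, k=k_anova).fit(X, y)
    X_anova = anova.transform(X)
    sfs = SequentialFeatureSelector(
        DecisionTreeClassifier(max_depth=3, random_state=0),
        n_features_to_select=k_final,
        direction="forward",
        cv=3,
    ).fit(X_anova, y)
    # Map the columns kept by the second stage back to original indices.
    return anova.get_support(indices=True)[sfs.get_support(indices=True)]


def fit_per_cluster(X, y, eps=1.5, min_samples=5):
    """Cluster samples with DBSCAN and fit one forest per cluster label.

    DBSCAN marks outliers with label -1; this sketch simply treats them
    as one more group (an assumption, not taken from the paper).
    """
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(X)
    models = {}
    for label in np.unique(labels):
        mask = labels == label
        forest = RandomForestClassifier(n_estimators=50, random_state=0)
        models[int(label)] = forest.fit(X[mask], y[mask])
    return labels, models


# Synthetic stand-in for flight records (hypothetical data).
X, y = make_classification(
    n_samples=300, n_features=12, n_informative=4, random_state=0
)
X = StandardScaler().fit_transform(X)

selected = select_features(X, y)
labels, models = fit_per_cluster(X[:, selected], y)

print("selected feature indices:", selected)
for label, model in models.items():
    size = int(np.sum(labels == label))
    print(f"cluster {label}: {size} samples, {len(model.estimators_)} trees")
```

Note that scikit-learn's `DBSCAN` has no `predict` method, so routing unseen flights to a cluster model would need an additional assignment rule (for example, nearest core sample), which the abstract does not specify.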
Pages: 16