Automating Maritime Risk Data Collection and Identification Leveraging Large Language Models

Cited by: 3
Authors
Huang, Donghao [1, 2]
Fu, Xiuju [3 ]
Yin, Xiaofeng [3 ]
Pen, Haibo [4 ]
Wang, Zhaoxia [1 ]
Affiliations
[1] Singapore Management Univ, Sch Comp & Informat Syst, Singapore, Singapore
[2] Mastercard, Res & Dev, Singapore, Singapore
[3] Agcy Sci Technol & Res, Inst High Performance Comp, Singapore, Singapore
[4] Tianjin Univ, Sch Elect & Informat Engn, Tianjin, Peoples R China
Source
2024 IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS, ICDMW | 2024
Keywords
Maritime Risk; Automated Data Collection; Risk Identification; Large Language Models; Traditional Machine Learning; GPT-4o; Llama-3.1; Ratio of Valid Categories (RVC)
DOI
10.1109/ICDMW65004.2024.00061
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Maritime risk research is crucial yet challenging for improving safety, efficiency, and sustainability in maritime operations. This paper presents an innovative method for automating the collection and identification of risk data related to global maritime risks from news sources, addressing the limitations of traditional manual methods. To evaluate the proposed method, different learning-based models, including conventional machine learning approaches and advanced Large Language Models (LLMs) such as GPT-4 and LLaMA-3.1, are comprehensively studied for comparison. In addition, not only do we use popular evaluation metrics to assess the proposed method, but we also introduce a new evaluation metric, called the "Ratio of Valid Categories (RVC)," to evaluate model reliability. The merits of the proposed method are demonstrated across different evaluation metrics. The research results show that the proposed LLM-based methods, particularly the GPT-4-based method, consistently outperform traditional models, significantly improving both the efficiency and accuracy of maritime risk data collection and identification. Our findings contribute to the expanding literature on LLM applications in risk management, demonstrating their potential to transform data collection and identification practices.
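The abstract names the "Ratio of Valid Categories (RVC)" but this record does not define it. One plausible reading, which is an assumption here and not the authors' stated formula, is the share of model outputs that fall inside the predefined set of risk categories. The sketch below illustrates that reading; the function name, the normalization step, and the category labels are all hypothetical.

```python
from typing import Iterable, Set


def ratio_of_valid_categories(predictions: Iterable[str],
                              valid_categories: Set[str]) -> float:
    """Fraction of predictions that match an allowed category.

    Assumed definition, not taken from the paper: a prediction counts as
    valid if, after trimming whitespace and lowercasing, it equals one of
    the allowed category labels.
    """
    preds = list(predictions)
    if not preds:
        return 0.0  # convention chosen for this sketch: no outputs, ratio 0
    allowed = {c.strip().lower() for c in valid_categories}
    valid = sum(1 for p in preds if p.strip().lower() in allowed)
    return valid / len(preds)


# Hypothetical category labels and model outputs, for illustration only.
VALID = {"piracy", "port congestion", "weather"}
outputs = ["Piracy", "weather", "I am not sure", "port congestion"]
rvc = ratio_of_valid_categories(outputs, VALID)
```

Under this reading, a free-text answer that matches no label (the third output above) lowers the ratio, so the metric would capture how reliably a model stays within the label set, independent of whether the chosen label is correct.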
Pages: 433-439
Number of pages: 7