Natural language processing-based approach for automatically coding ship sensor data

Cited by: 1
Authors
Kim, Yunhui [1]
Park, Kwangphil [2]
Yoo, Byeongwoo [2]
Affiliations
[1] Chungnam Natl Univ, Dept Naval Architecture & Ocean Engn, Daejeon, South Korea
[2] Chungnam Natl Univ, Dept Autonomous Vehicle Syst Engn, Daejeon, South Korea
Keywords
Text classification; Natural language processing; TF-IDF; Word embedding; KNN; SVM; WORD; DESIGN
DOI
10.1016/j.ijnaoe.2023.100581
Chinese Library Classification
U6 [Waterway transport]; P75 [Ocean engineering]
Discipline classification codes
0814 ; 081505 ; 0824 ; 082401 ;
Abstract
The digital transformation of ship systems requires the coding and management of large amounts of Input/Output (IO) data generated by various pieces of equipment during ship operation. In this study, we investigated a method that recognizes the text of a ship's IO descriptions in order to code the IO data automatically. The characteristics of the IO descriptions were extracted using Term Frequency-Inverse Document Frequency (TF-IDF) and word embedding. Machine learning techniques such as K-Nearest Neighbors (KNN) and Support Vector Machine (SVM), and deep learning models such as Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), and bidirectional LSTM (BiLSTM), were then used to classify the descriptions into codes. Applying different text preprocessing techniques based on the unique characteristics of the data improved the performance of the algorithms; the experimental results showed an accuracy of up to 91%, with an average improvement in accuracy of 5% for each algorithm.
Pages: 12