Text Preprocessing Approaches in CNN for Disaster Reports Dataset

Cited by: 1
Authors
Arisha, Andriansyah Oktafiandi [1]
Hazriani [1]
Wabula, Yuyun [1]
Affiliation
[1] Handayani Univ, Dept Comp Syst, Makassar, Indonesia
Source
2023 INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE IN INFORMATION AND COMMUNICATION, ICAIIC | 2023
Keywords
Text Preprocessing; CNN; Disaster; automatic; semi-automatic
DOI
10.1109/ICAIIC57133.2023.10067109
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This study compares two text-preprocessing methods, automatic and semi-automatic, for training a CNN on a disaster report dataset. On a dataset of 200 records, automatic text preprocessing yields average accuracies of 0.81 and 1 for the 80:20 and 90:10 data splits, respectively. The optimized model, which uses the semi-automatic text preprocessing approach proposed by the authors, achieves average accuracies of 0.95 and 1 for the same 80:20 and 90:10 splits. The results indicate that cleaning the dataset with semi-automatic text preprocessing improves accuracy over the automatic approach. The proposed model converges on the 80:20 split at epoch 20 with batch size 5 and random state 34; on the 90:10 split, the best convergence occurs at epochs 20-30.
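The split sizes behind these figures can be sketched as follows. This is a minimal illustration only, assuming the 80:20 and 90:10 ratios are train:test splits of the 200 records; the helper `split_indices` is hypothetical, and the seed 34 is used here with Python's standard `random` module, which will not reproduce the shuffle of whatever library the authors used for their random state.

```python
import random


def split_indices(n_records, train_fraction, seed):
    """Shuffle record indices with a fixed seed and split them into train/test.

    Hypothetical helper for illustration; not the authors' code.
    """
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)  # seeded, so the split is repeatable
    n_train = round(n_records * train_fraction)
    return indices[:n_train], indices[n_train:]


for train_fraction in (0.8, 0.9):
    train_idx, test_idx = split_indices(200, train_fraction, seed=34)
    # One misclassified test record changes accuracy by 1 / len(test_idx).
    print(train_fraction, len(train_idx), len(test_idx), 1 / len(test_idx))
```

Under that assumption the held-out sets contain 40 and 20 records, so a single misclassification shifts test accuracy by 0.025 and 0.05 respectively, which is worth keeping in mind when reading the reported averages.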
Pages: 216-220
Page count: 5