A Comparative Text Classification Study with Deep Learning-Based Algorithms

Cited by: 8
Authors
Koksal, Omer [1]
Akgul, Ozlem [2]
Affiliations
[1] ASELSAN, Artificial Intelligence & Informat Technol Dept, Ankara, Turkey
[2] Middle East Tech Univ, Elect & Elect Engn Dept, Ankara, Turkey
Source
2022 9TH INTERNATIONAL CONFERENCE ON ELECTRICAL AND ELECTRONICS ENGINEERING (ICEEE 2022) | 2022
Keywords
text classification; deep learning; convolutional neural network; recurrent neural network; LSTM; GRU
DOI
10.1109/ICEEE55327.2022.9772587
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
As a well-known Natural Language Processing (NLP) task, text classification can be defined as the process of categorizing documents according to their content. In this process, selecting classification algorithms and tuning classification parameters are crucial for efficient classification. In recent years, many deep learning algorithms have been used successfully in text classification tasks. This paper presents a comparative study that applies and optimizes several deep learning-based algorithms. We implemented deep neural networks (DNN), convolutional neural networks (CNN), long short-term memory (LSTM), and gated recurrent units (GRU), and performed extensive experiments tuning hyperparameters to improve classification accuracy. In addition, we implemented word embedding techniques to obtain feature vectors of the text data, and compared the word embedding results with the traditional TF-IDF vectorization results for DNN and CNN. In our experiments, we used an open-source Turkish News benchmarking dataset so that our results could be compared with previous studies in the literature. Our experimental results revealed significant improvements in classification performance when using word embeddings with deep learning-based algorithms and tuned hyperparameters. Furthermore, our work outperformed previous results on the selected dataset.
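The TF-IDF vectorization that the abstract contrasts with word embeddings can be illustrated with a minimal sketch. The abstract does not state which TF-IDF variant or tooling the authors used, so the smoothed IDF formula, the function name `tfidf_vectors`, and the toy corpus below are all assumptions made for illustration only.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, vectors) for a list of tokenized documents.

    Assumed weighting (not confirmed by the paper): raw term frequency
    times a smoothed IDF, idf(t) = ln((1 + N) / (1 + df(t))) + 1.
    """
    n_docs = len(docs)
    vocab = sorted({tok for doc in docs for tok in doc})
    # Document frequency: number of documents that contain each term.
    df = Counter(tok for doc in docs for tok in set(doc))
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1.0 for t in vocab}
    vectors = []
    for doc in docs:
        tf = Counter(doc)  # missing terms count as 0
        vectors.append([tf[t] * idf[t] for t in vocab])
    return vocab, vectors


# Hypothetical toy corpus; not taken from the dataset used in the paper.
docs = [
    "ekonomi haber piyasa".split(),
    "spor haber mac".split(),
    "spor mac gol".split(),
]
vocab, vectors = tfidf_vectors(docs)
```

In this sketch a term that occurs in only one document, such as `ekonomi`, receives a larger weight than a term shared across documents, such as `haber`. Each vector dimension corresponds to exactly one vocabulary term, which is the sparse representation that dense word embeddings replace.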
Pages: 387-391
Number of pages: 5