Modified Convolutional Neural Network Filter Gate for Social Media Text Classification

Cited by: 0

Authors:

Suhaimi, Nur Suhailayani ^{[1,2]}

Othman, Zalinda ^{[3]}

Yaakub, Mohd Ridzwan ^{[3]}

Affiliations:

[1] Univ Teknol MARA, Shah Alam, Malaysia

[2] Univ Kebangsaan Malaysia, Fac Informat Sci & Technol, Bangi, Malaysia

[3] Univ Kebangsaan Malaysia, Bangi, Malaysia

Source:

INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY | 2022 / Vol. 22 / Issue 05

Keywords:

Text pre-processing; text input filtering; convolutional neural network; multi-gate text filtering

DOI:

10.22937/IJCSNS.2022.22.5.86

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Subject classification code:

0812

Abstract:

The capacity of the Convolutional Neural Network (ConvNet) to handle unpredictable and continuous stream input has attracted researchers' attention in a variety of fields. One of ConvNet's features is filtering during input reception. However, sampling filters alone lead to low pre-processing accuracy and precision. This work proposes an upgraded Filter Gate with Text Pre-processing (FGTP) to address this pre-processing problem during input reception. We filter continuous text input of uncertain size with three algorithms, Fourier Transform (FT), Porter's Algorithm (PA), and Correction Filter (CF), before passing it to the ConvNet algorithm for classification. We use the FT method to delete redundant or similar text, which handles duplicated and repetitive text. We use the PA approach to stem words and reduce out-of-vocabulary words in continuous sentences. The CF algorithm then deals with misspellings and typos. We compare the filtered results against random sampling filtering, and we compare word detection accuracy with and without FGTP. Finally, we compare the classification accuracy of ConvNet with FGTP to that of ConvNet with random sampling. The results demonstrate the influence of multiple filtering stages in text pre-processing on text classification: the proposed method yielded 83.4% accuracy, while conventional filtering yielded 65.33%, a relative improvement of over 27 per cent (18.07 percentage points).
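
The three-stage gate described in the abstract (duplicate removal, then stemming, then correction) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract gives no algorithmic detail, so each stage below is a simple stand-in. Exact-match de-duplication replaces the FT stage, a crude suffix stripper replaces Porter's Algorithm, and a `difflib` lookup against a toy vocabulary replaces the Correction Filter. The names `VOCAB`, `drop_duplicates`, `crude_stem`, `correct`, and `filter_gate`, as well as the 0.75 similarity cutoff, are assumptions made for illustration only.

```python
import difflib
import re

# Hypothetical toy vocabulary; the paper does not specify one.
VOCAB = ["good", "movie", "great", "love", "watch", "the", "this", "is", "a"]


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z']+", text.lower())


def drop_duplicates(texts):
    """Stage 1 stand-in (NOT a Fourier Transform): drop posts whose
    normalized token sequence repeats an earlier post."""
    seen = set()
    kept = []
    for text in texts:
        key = " ".join(tokenize(text))
        if key not in seen:
            seen.add(key)
            kept.append(text)
    return kept


def crude_stem(word):
    """Stage 2 stand-in (NOT the full Porter algorithm): strip one common
    suffix when at least four characters remain."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 4:
            return word[: -len(suffix)]
    return word


def correct(word, vocab=VOCAB):
    """Stage 3 stand-in: map a token to its closest vocabulary entry,
    or leave it unchanged when nothing is similar enough."""
    if word in vocab:
        return word
    matches = difflib.get_close_matches(word, vocab, n=1, cutoff=0.75)
    return matches[0] if matches else word


def filter_gate(texts):
    """Apply the stages in the order the abstract gives: de-duplicate,
    stem, then correct. Returns one token list per surviving post."""
    return [
        [correct(crude_stem(tok)) for tok in tokenize(text)]
        for text in drop_duplicates(texts)
    ]
```

Calling `filter_gate(["I love this movie", "i LOVE this movie!!", "Watching a graet movie"])` drops the second post as a duplicate of the first, stems "watching", and maps the typo "graet" to a vocabulary word. In a real pipeline the resulting token lists would then be encoded and passed to the ConvNet classifier, which is outside the scope of this sketch.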

Citation

Pages: 617-627

Page count: 11

