Position-context additive transformer-based model for classifying text data on social media

Cited by: 0
Authors
Abd-Elaziz, M. M. [1 ]
El-Rashidy, Nora [2 ]
Abou Elfetouh, Ahmed [1 ]
El-Bakry, Hazem M. [1 ]
Affiliations
[1] Mansoura Univ, Fac Comp & Informat Sci, Informat Syst Dept, Mansoura, Egypt
[2] Kaferelshikh Univ, Fac Artificial Intelligence, Machine Learning & Informat Retrieval Dept, Kafr Al Sheikh, Egypt
Source
SCIENTIFIC REPORTS | 2025 / Vol. 15 / Issue 01
Keywords
Social media; Transformer-based model; Word embedding; Bi-LSTM network; Additive attention; NEURAL-NETWORKS;
DOI
10.1038/s41598-025-90738-1
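Chinese Library Classification (CLC)
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification codes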
07 ; 0710 ; 09 ;
Abstract
In recent years, the continuous growth of text data on social media has been a major reason to rely on pre-training when developing new text classification models, especially transformer-based models, which have proven effective in most natural language processing tasks. This paper introduces a new Position-Context Additive transformer-based model (PCA model) that consists of two phases and aims to increase the accuracy of text classification on social media. Phase I develops a new way to extract text features by attending to the position and context of each word in the input layer. This is done by integrating an improved word embedding method (the position) with a developed Bi-LSTM network that strengthens the focus on each word's connection to the words around it (the context). Phase II develops a transformer-based model built primarily on an improved additive attention mechanism. The PCA model was tested on the classification of health-related social media texts across six datasets. Compared with the best published results, the F1-score improved by between 0.2 and 10.2% on five datasets. The PCA model was also compared with three transformer-based models that have shown high accuracy in text classification; it outperformed them on four datasets, improving the F1-score by between 0.1 and 2.1%. The results also indicate a direct correlation between the volume of training data and performance: increasing the volume of training data positively affects the F1-score improvement.
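For readers unfamiliar with the additive attention named in the abstract, the following is a minimal sketch of the generic (Bahdanau-style) form, in which each key is scored against a query as v · tanh(W_q q + W_k k). It is not the improved mechanism proposed in the paper, whose details this record does not give. The function name `additive_attention`, the parameters `W_q`, `W_k`, and `v`, and the toy vectors are all assumptions chosen for illustration.

```python
import math


def additive_attention(query, keys, values, W_q, W_k, v):
    """Generic additive attention for a single query (illustrative only).

    score_i = v . tanh(W_q @ query + W_k @ keys[i])
    weights = softmax(scores); context = sum_i weights[i] * values[i]
    All parameter names here are hypothetical, not taken from the paper.
    """
    def matvec(matrix, vector):
        # Plain matrix-vector product over nested lists.
        return [sum(m * x for m, x in zip(row, vector)) for row in matrix]

    q_proj = matvec(W_q, query)
    scores = []
    for key in keys:
        k_proj = matvec(W_k, key)
        hidden = [math.tanh(a + b) for a, b in zip(q_proj, k_proj)]
        scores.append(sum(vi * hi for vi, hi in zip(v, hidden)))

    # Numerically stable softmax over the scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Context vector: attention-weighted sum of the value vectors.
    dim = len(values[0])
    context = [sum(w * val[d] for w, val in zip(weights, values))
               for d in range(dim)]
    return weights, context


# Toy usage: three 2-dimensional "token" vectors, identity projections.
identity = [[1.0, 0.0], [0.0, 1.0]]
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = additive_attention(
    query=[1.0, 0.0], keys=tokens, values=tokens,
    W_q=identity, W_k=identity, v=[1.0, 1.0],
)
```

In a trained model `W_q`, `W_k`, and `v` would be learned, and the queries and keys would typically be hidden states such as those produced by a Bi-LSTM; here they are fixed by hand so the computation can be followed step by step.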
Pages: 11
Related papers
39 in total
[21]   Estimation of User Location and Local Topics Based on Geo-tagged Text Data on Social Media [J].
Ishida, Kazunari .
2015 IIAI 4TH INTERNATIONAL CONGRESS ON ADVANCED APPLIED INFORMATICS (IIAI-AAI), 2015, :14-17
[22]   A study on predictive modeling of users’ parasocial relationship types based on social media text big data [J].
Meng J. ;
Chen Y. .
International Journal of Circuits, Systems and Signal Processing, 2022, 16 :171-180
[23]   The Application of Big Data Technology in the Educational Model of Psychological Parenting in Colleges and Universities in the Context of Social Media [J].
Li W. .
Applied Mathematics and Nonlinear Sciences, 2024, 9 (01)
[24]   Testing an extended model of consumer behavior in the context of social media-based brand communities [J].
Habibi, Mohammad Reza ;
Laroche, Michel ;
Richard, Marie-Odile .
COMPUTERS IN HUMAN BEHAVIOR, 2016, 62 :292-302
[25]   Predicting Generalized Anxiety Disorder From Impromptu Speech Transcripts Using Context-Aware Transformer-Based Neural Networks: Model Evaluation Study [J].
Teferra, Bazen Gashaw ;
Rose, Jonathan .
JMIR MENTAL HEALTH, 2023, 10
[26]   A Text Data Mining-Based Digital Transformation Opinion Thematic System for Online Social Media Platforms [J].
Liao, Haihan ;
Wang, Chengmin ;
Gu, Yanzhang ;
Liu, Renhuai .
SYSTEMS, 2025, 13 (03)
[27]   Fulmqa: a fuzzy logic-based model for social media data quality assessment [J].
Reda, Oumaima ;
Zellou, Ahmed .
SOCIAL NETWORK ANALYSIS AND MINING, 2023, 13 (01)
[28]   Fulmqa: a fuzzy logic-based model for social media data quality assessment [J].
Oumaima Reda ;
Ahmed Zellou .
Social Network Analysis and Mining, 13
[29]   Probabilistic Topic Model based Approach for Detecting Bursty Events from Social Media Data [J].
Li, Chunshan ;
Chu, Dianhui .
2017 INTERNATIONAL CONFERENCE ON SECURITY, PATTERN ANALYSIS, AND CYBERNETICS (SPAC), 2017, :701-706
[30]   Designing an intelligent push model for user emotional topics based on dynamic text categorization in social media news dissemination [J].
Wang, Jixuan .
PEERJ COMPUTER SCIENCE, 2024, 10 :1-20