Enhance Social Network Bullying Detection Using Multi-Teacher Knowledge Distillation With XGBoost Classifier

Cited by: 0
Author
Prasomphan, Sathit [1]
Affiliation
[1] King Mongkuts Univ Technol North Bangkok, Fac Appl Sci, Dept Comp & Informat Sci, Bangkok 10800, Thailand
Keywords
Cyberbullying; multi-teacher model; knowledge distillation; soft targets; student model; XGBoost classifier; sentiment analysis
DOI
10.1109/ACCESS.2025.3574679
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Cyberbullying remains a pressing issue in Thai social media, especially among teenagers. While many studies have explored deep learning approaches for sentiment analysis or toxicity detection, the detection of cyberbullying, especially in the Thai language, remains underexplored. This study introduces a novel framework that enhances cyberbullying detection by integrating Multi-Teacher Knowledge Distillation (MTKD) with an XGBoost classifier, specifically adapted for Thai-language social media posts. Unlike prior work that relies solely on neural models, this research demonstrates how distilled soft labels from diverse teacher models can be effectively transferred to a lightweight and interpretable XGBoost student model. A key contribution of this study is the successful adaptation of XGBoost, traditionally used for structured/tabular data, to a natural language classification task by using rich semantic features extracted via pre-trained NLP models. Additionally, although the selected datasets (Wisesight, Thai Toxic Tweet, and 40 Thai Children Stories) are often used for sentiment analysis, we reframe and preprocess them for cyberbullying classification by focusing on toxic, harmful, or aggressive linguistic patterns. Our framework achieved strong classification performance, with accuracies of 92.5%, 90.5%, and 91.0% across the three datasets, demonstrating its robustness and practical applicability to Thai-language cyberbullying detection.
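To make the "distilled soft labels from diverse teacher models" idea concrete, the following is a minimal sketch of how several teachers' outputs could be merged into one soft target per post. It is not the paper's implementation: the abstract does not state the temperature, the teacher weighting, or how the student consumes the targets, so the function names, the temperature of 2.0, the equal weights, and the example logits are all assumptions chosen for illustration.

```python
import math

def softmax_with_temperature(logits, temperature=2.0):
    """Turn raw class scores into a softened probability distribution."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def combine_teachers(teacher_logits, weights=None, temperature=2.0):
    """Weighted average of the teachers' softened distributions for one post."""
    if weights is None:
        # Assumption: equal trust in every teacher.
        weights = [1.0 / len(teacher_logits)] * len(teacher_logits)
    soft = [softmax_with_temperature(l, temperature) for l in teacher_logits]
    n_classes = len(soft[0])
    return [sum(w * p[c] for w, p in zip(weights, soft)) for c in range(n_classes)]

# Hypothetical logits from three teachers for a single post,
# with classes ordered as [not bullying, bullying].
teacher_logits = [[0.2, 1.8], [-0.5, 2.1], [1.0, 0.9]]
soft_target = combine_teachers(teacher_logits)
print(soft_target)
```

In a setup like the one the abstract describes, a tree-based student would then be fit on text features (for example, embeddings from a pre-trained NLP model) with these soft targets in place of, or alongside, the hard labels. How exactly that is done in the paper is not stated in the abstract.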
Pages: 95618-95627
Page count: 10