Towards understanding the role of content-based and contextualized features in detecting abuse on Twitter

Cited by: 0

Authors:

Hussain, Kamal [1,5]

Saeed, Zafar [1,2,5]

Abbasi, Rabeeh [1,3,5]

Sindhu, Muddassar [1,3,5]

Khattak, Akmal [1,3,5]

Arafat, Sachi [1,4,5]

Daud, Ali [1,5]

Mushtaq, Mubashar [1,5,6]

Affiliations:

[1] Univ Lisbon, Inst Super Tecn, Lisbon, Portugal

[2] Univ Bari, Dipartimento Informat, Bari, Italy

[3] Quaid i Azam Univ, Dept Comp Sci, Islamabad, Pakistan

[4] King Abdulaziz Univ, Fac Comp & Informat Technol, Jeddah, Saudi Arabia

[5] Rabdan Acad, Fac Resilience, Abu Dhabi, U Arab Emirates

[6] Forman Christian Coll, Dept Comp Sci, Lahore, Pakistan

Source:

HELIYON | 2024 / Volume 10 / Issue 08

Keywords:

Abuse; Context; Machine learning; Social media; Twitter; EVENT DETECTION; HEARTBEAT GRAPH; HATE SPEECH; ANT LION; ALGORITHM;

DOI:

10.1016/j.heliyon.2024.e29593

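Chinese Library Classification (CLC):

O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];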

Discipline classification codes:

07 ; 0710 ; 09 ;

Abstract:

This paper presents a novel approach for detecting abuse on Twitter. Abusive posts have become a major problem for social media platforms like Twitter, and identifying abuse is important to mitigate its potential harm. Many researchers have proposed methods to detect abuse on Twitter. However, most existing approaches look only at the content of the abusive tweet in isolation and do not consider its contextual information, particularly the tweets posted before the abusive tweet. In this paper, we propose a new method for detecting abuse that uses contextual information from the tweets that precede and follow the abusive tweet. We hypothesize that this contextual information can be used to better understand the intent of the abusive tweet and to identify abuse that content-based methods would otherwise miss. We performed extensive experiments to identify the best combination of features and machine learning algorithms to detect abuse on Twitter. In the experiments, we test eight different machine learning classifiers on content-based and context-based features. The proposed method is compared with existing abuse detection methods and achieves an absolute improvement in accuracy of around 7 percentage points. The best results are obtained by combining the content-based and context-based features. The highest accuracy of the proposed method is 86%, whereas the existing methods used for comparison have a highest accuracy of 79.2%.
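The idea of combining content-based and context-based features can be illustrated with a minimal sketch. It is not the authors' implementation: the record does not state which features or classifiers the paper uses, so the bag-of-words representation, the `content:`/`context:` prefixes, and the function names `bag_of_words` and `combined_features` are all assumptions made for illustration.

```python
from collections import Counter


def bag_of_words(text, prefix):
    """Count lowercase whitespace-separated tokens, tagging each with a prefix.

    The prefix keeps content tokens and context tokens in separate
    feature namespaces (an illustrative choice, not taken from the paper).
    """
    return Counter(f"{prefix}:{token}" for token in text.lower().split())


def combined_features(target_tweet, context_tweets):
    """Merge features of the target tweet with features of surrounding tweets."""
    features = bag_of_words(target_tweet, "content")
    for tweet in context_tweets:
        # Counter.update adds counts rather than overwriting them.
        features.update(bag_of_words(tweet, "context"))
    return features


# Hypothetical example: one target tweet and two tweets from its thread.
feats = combined_features(
    "you are pathetic",
    ["nice goal today", "you played badly"],
)
```

A feature dictionary of this kind could then be vectorized and passed to any standard classifier. Keeping the two namespaces separate lets a model weigh a word differently depending on whether it appears in the tweet itself or in its surrounding conversation.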

Pages: 17
