Towards understanding the role of content-based and contextualized features in detecting abuse on Twitter

Cited by: 0
Authors
Hussain, Kamal [1, 5]
Saeed, Zafar [1, 2, 5]
Abbasi, Rabeeh [1, 3, 5]
Sindhu, Muddassar [1, 3, 5]
Khattak, Akmal [1, 3, 5]
Arafat, Sachi [1, 4, 5]
Daud, Ali [1, 5]
Mushtaq, Mubashar [1, 5, 6]
Affiliations
[1] Univ Lisbon, Inst Super Tecn, Lisbon, Portugal
[2] Univ Bari, Dipartimento Informat, Bari, Italy
[3] Quaid i Azam Univ, Dept Comp Sci, Islamabad, Pakistan
[4] King Abdulaziz Univ, Fac Comp & Informat Technol, Jeddah, Saudi Arabia
[5] Rabdan Acad, Fac Resilience, Abu Dhabi, U Arab Emirates
[6] Forman Christian Coll, Dept Comp Sci, Lahore, Pakistan
Keywords
Abuse; Context; Machine learning; Social media; Twitter; EVENT DETECTION; HEARTBEAT GRAPH; HATE SPEECH; ANT LION; ALGORITHM
DOI
10.1016/j.heliyon.2024.e29593
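Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09
Abstract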
This paper presents a novel approach for detecting abuse on Twitter. Abusive posts have become a major problem for social media platforms such as Twitter, and identifying abuse is important for mitigating its potential harm. Many researchers have proposed methods to detect abuse on Twitter. However, most existing approaches look only at the content of the abusive tweet in isolation and do not consider its contextual information, particularly the tweets posted before the abusive tweet. In this paper, we propose a new method for detecting abuse that uses contextual information from the tweets that precede and follow the abusive tweet. We hypothesize that this contextual information can be used to better understand the intent of the abusive tweet and to identify abuse that content-based methods would otherwise miss. We performed extensive experiments to identify the best combination of features and machine learning algorithms for detecting abuse on Twitter, testing eight machine learning classifiers on content-based and context-based features. The best results are obtained by combining the content-based and context-based features. The highest accuracy of the proposed method is 86%, whereas the existing methods used for comparison reach a highest accuracy of 79.2%, an absolute improvement of around 7 percentage points.
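The idea of combining content-based and context-based features can be illustrated with a minimal sketch. The abstract does not specify the actual features or classifiers, so everything here is an assumption made for illustration: the bag-of-words representation, the `content:` and `context:` prefixes, the `window` parameter, and the function names `tokenize` and `combined_features` are all hypothetical and are not the authors' implementation.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a tweet and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def combined_features(thread, index, window=1):
    """Build one feature vector for the tweet at position `index`.

    Content features: token counts of the target tweet itself.
    Context features: token counts of up to `window` tweets before
    and after it in the same thread (an assumed notion of context).
    The two groups are kept apart by a prefix so that a downstream
    classifier could weigh them differently.
    """
    features = Counter()
    for token in tokenize(thread[index]):
        features["content:" + token] += 1

    start = max(0, index - window)
    neighbours = thread[start:index] + thread[index + 1:index + 1 + window]
    for tweet in neighbours:
        for token in tokenize(tweet):
            features["context:" + token] += 1
    return features


# Hypothetical three-tweet thread; the middle tweet is the one to classify.
thread = ["You lost the game again", "What a joke", "Stop harassing me"]
feats = combined_features(thread, 1)
```

In this toy thread, the target tweet "What a joke" carries little signal by itself, while the following reply contributes the `context:harassing` feature. The resulting dictionary could then be passed to any standard classifier; which of the eight classifiers performed best is not stated in the abstract.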
Pages: 17
Related papers
27 in total
  • [1] Multi-label Emotion Classification using Content-Based Features in Twitter
    Ameer, Iqra
    Ashraf, Noman
    Sidorov, Grigori
    Gomez-Adorno, Helena
    COMPUTACION Y SISTEMAS, 2020, 24 (03): 1159-1164
  • [2] Understanding the role of firm-generated content by hotel segment: the case of Twitter
    Kim, Woo-Hyuk
    Park, Eunhye
    Kim, Sung-Bum
    CURRENT ISSUES IN TOURISM, 2023, 26 (01): 122-136
  • [3] Twitter-User Recommender System using Tweets: A Content-based Approach
    Nidhi, R. H.
    Annappa, B.
    2017 INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE IN DATA SCIENCE (ICCIDS), 2017
  • [4] Detecting life events from twitter based on temporal semantic features
    Khodabakhsh, Maryam
    Kahani, Mohsen
    Bagheri, Ebrahim
    Noorian, Zeinab
    KNOWLEDGE-BASED SYSTEMS, 2018, 148: 1-16
  • [5] A Multi-view Content-Based User Recommendation Scheme for Following Users in Twitter
    Chechev, Milen
    Georgiev, Petko
    SOCIAL INFORMATICS, SOCINFO 2012, 2012, 7710: 434-447
  • [6] Spammer Classification Using Ensemble Methods over Content-Based Features
    Makkar, Aaisha
    Goel, Shivani
    PROCEEDINGS OF SIXTH INTERNATIONAL CONFERENCE ON SOFT COMPUTING FOR PROBLEM SOLVING, SOCPROS 2016, VOL 2, 2017, 547: 1-9
  • [7] Networking Impact of a Local-Aware Content-Based Delivery for Twitter-Like Applications
    Truong, Patrick
    Mathieu, Bertrand
    Peltier, Jean-Francois
    2015 8TH INTERNATIONAL CONFERENCE ON INTELLIGENCE IN NEXT GENERATION NETWORKS, 2015: 224-230
  • [8] Real-time Twitter Content Polluter Detection Based on Direct Features
    Chen, Weiling
    Yeo, Chai Kiat
    Lau, Chiew Tong
    Lee, Bu Sung
    2015 2ND INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE AND SECURITY (ICISS), 2015: 240-243
  • [9] Content-based medical image retrieval using fractional Hartley transform with hybrid features
    Rani, K. Vijila
    Prince, M. Eugine
    Therese, P. Sujatha
    Shermila, P. Josephin
    Devi, E. Anna
    MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 (09): 27217-27242
  • [10] Content-Based Image Retrieval Using Error Diffusion Block Truncation Coding Features
    Guo, Jing-Ming
    Prasetyo, Heri
    Chen, Jen-Ho
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2015, 25 (03): 466-481