Audio-Based Hate Speech Classification from Online Short-Form Videos

Cited by: 4
Authors
Ibanez, Michael [1]
Sapinit, Ranz [1]
Reyes, Lloyd Antonie [1]
Hussien, Mohammed [1]
Imperial, Joseph Marvin [1]
Rodriguez, Ramon [1]
Affiliation
[1] Natl Univ, Manila, Philippines
Source
2021 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING (IALP) | 2021
Keywords
hate speech; tiktok; audio classification; machine learning; speech processing
DOI
10.1109/IALP54817.2021.9675250
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In this study, we pioneer the development of an audio-based hate speech classifier from online, short-form TikTok videos using traditional machine learning algorithms such as Logistic Regression, Random Forest, and Support Vector Machines. We scraped over 4746 videos using the TikTok API tool and extracted audio-based features such as MFCCs, Spectral Centroid, Rolloff, Bandwidth, Zero-Crossing Rate, and Chroma values as primary feature sets. Results show that using the extracted predictors for hate speech detection can obtain up to 78.5% accuracy on an optimized Random Forest model, crossing the 50% benchmark for models in this task. In addition, comparing the Information Gain scores and globally learned model weights identified that Spectral Rolloff and MFCCs are top predictors in discriminating hate speech for the Filipino language.
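To make a few of the audio features named in the abstract concrete, the following is a minimal sketch of Zero-Crossing Rate, Spectral Centroid, and Spectral Rolloff computed on a single synthetic frame. It is not the authors' implementation, which the abstract does not describe. The function names, the naive DFT, the 8000 Hz sample rate, the 256-sample frame, and the 85% rolloff fraction are all assumptions chosen for illustration.

```python
import cmath
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def magnitude_spectrum(frame):
    """Magnitudes of the non-negative-frequency bins of a naive O(n^2) DFT."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(frame)
        )
        mags.append(abs(acc))
    return mags


def spectral_centroid(mags, sample_rate, n):
    """Magnitude-weighted mean frequency in Hz."""
    total = sum(mags)
    if total == 0:
        return 0.0
    return sum(k * sample_rate / n * m for k, m in enumerate(mags)) / total


def spectral_rolloff(mags, sample_rate, n, fraction=0.85):
    """Lowest bin frequency (Hz) at which the cumulative magnitude
    reaches `fraction` of the total (0.85 is an assumed default)."""
    threshold = fraction * sum(mags)
    running = 0.0
    for k, m in enumerate(mags):
        running += m
        if running >= threshold:
            return k * sample_rate / n
    return (len(mags) - 1) * sample_rate / n


# Synthetic stand-in for one audio frame: a 1000 Hz sine with a small
# phase offset so that no sample lands exactly on zero.
SAMPLE_RATE = 8000  # assumed for illustration
FRAME_LEN = 256     # assumed for illustration
frame = [
    math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE + math.pi / 8)
    for t in range(FRAME_LEN)
]

mags = magnitude_spectrum(frame)
zcr = zero_crossing_rate(frame)
centroid = spectral_centroid(mags, SAMPLE_RATE, FRAME_LEN)
rolloff = spectral_rolloff(mags, SAMPLE_RATE, FRAME_LEN)
```

For a pure tone, the centroid and rolloff both sit at the tone's frequency. On real speech, per-frame values like these would typically be aggregated over each clip (for example, by their mean) before being passed to a classifier such as a Random Forest.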
Pages: 72-77
Number of pages: 6