Ensemble Deep Learning for Multilabel Binary Classification of User-Generated Content

Cited by: 36
Authors
Haralabopoulos, Giannis [1]
Anagnostopoulos, Ioannis [2]
McAuley, Derek [1]
Affiliations
[1] Univ Nottingham, Sch Comp Sci, Nottingham NG8 1BB, England
[2] Univ Thessaly, Dept Comp Sci & Biomed Informat, Lamia 35131, Greece
Funding
UK Engineering and Physical Sciences Research Council (EPSRC)
Keywords
ensemble learning; sentiment analysis; multilabel classification; deep neural networks; pure emotion; SemEval-2018 Task 1; toxic comment classification; SENTIMENT ANALYSIS; DIFFERENTIAL EVOLUTION; NEURAL-NETWORKS
DOI
10.3390/a13040083
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Sentiment analysis usually refers to the analysis of human-generated content via a polarity filter. Affective computing deals with the exact emotions conveyed through information. Emotional information most frequently cannot be accurately described by a single emotion class. Multilabel classifiers can categorize human-generated content into multiple emotional classes. Ensemble learning can improve the statistical, computational and representation aspects of such classifiers. We present a baseline stacked ensemble and propose a weighted ensemble. Our proposed weighted ensemble can use multiple classifiers to improve classification results without hyperparameter tuning or data overfitting. We evaluate our ensemble models with two datasets. The first dataset is from SemEval-2018 Task 1 and contains almost 7000 tweets, labeled with 11 sentiment classes. The second dataset is the Toxic Comment Dataset with more than 150,000 comments, labeled with six different levels of abuse or harassment. Our results suggest that ensemble learning improves classification results by 1.5% to 5.4%.
Pages: 14
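
The general idea of a weighted ensemble for multilabel binary classification can be illustrated with a small sketch. This is not the authors' method: the record does not describe how the paper derives its weights, so the weights, probabilities, 0.5 threshold, and the function name `weighted_ensemble` below are all hypothetical. The sketch only shows how per-label probabilities from several classifiers might be combined by a normalized weighted average and then thresholded independently per label.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]  # rows are samples, columns are labels


def weighted_ensemble(
    model_probs: Sequence[Matrix],
    weights: Sequence[float],
    threshold: float = 0.5,
) -> Tuple[Matrix, List[List[int]]]:
    """Combine per-label probabilities from several classifiers.

    model_probs holds one (samples x labels) matrix per classifier.
    weights holds one non-negative weight per classifier (hypothetical
    values here; how to choose them is outside this sketch).
    """
    if not weights or len(model_probs) != len(weights):
        raise ValueError("need exactly one weight per classifier")
    if any(w < 0 for w in weights) or sum(weights) == 0:
        raise ValueError("weights must be non-negative and not all zero")

    # Normalize so the combined value stays a probability in [0, 1].
    total = sum(weights)
    norm = [w / total for w in weights]

    n_samples = len(model_probs[0])
    n_labels = len(model_probs[0][0])

    # Weighted average of the classifiers' probabilities, per sample and label.
    combined = [
        [
            sum(norm[m] * model_probs[m][i][j] for m in range(len(norm)))
            for j in range(n_labels)
        ]
        for i in range(n_samples)
    ]

    # Each label is decided independently, so a sample may get several labels.
    labels = [[1 if p >= threshold else 0 for p in row] for row in combined]
    return combined, labels


# Two hypothetical classifiers scoring two samples on three labels.
model_a = [[0.9, 0.2, 0.6], [0.1, 0.7, 0.4]]
model_b = [[0.7, 0.4, 0.1], [0.3, 0.9, 0.6]]

combined, labels = weighted_ensemble([model_a, model_b], weights=[3.0, 1.0])
print(labels)
```

With these made-up inputs the first classifier counts three times as much as the second, so its opinion dominates wherever the two disagree.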