Ensemble Deep Learning for Multilabel Binary Classification of User-Generated Content

Cited by: 36

Authors:

Haralabopoulos, Giannis [1]

Anagnostopoulos, Ioannis [2]

McAuley, Derek [1]

Affiliations:

[1] Univ Nottingham, Sch Comp Sci, Nottingham NG8 1BB, England

[2] Univ Thessaly, Dept Comp Sci & Biomed Informat, Lamia 35131, Greece

Source:

ALGORITHMS | 2020 / Vol. 13 / Issue 04

Funding:

Engineering and Physical Sciences Research Council (EPSRC), UK;

Keywords:

ensemble learning; sentiment analysis; multilabel classification; deep neural networks; pure emotion; Semeval 2018 Task 1; toxic comment classification; SENTIMENT ANALYSIS; DIFFERENTIAL EVOLUTION; NEURAL-NETWORKS;

DOI:

10.3390/a13040083

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Sentiment analysis usually refers to the analysis of human-generated content via a polarity filter. Affective computing deals with the exact emotions conveyed through information. Emotional information most frequently cannot be accurately described by a single emotion class. Multilabel classifiers can categorize human-generated content in multiple emotional classes. Ensemble learning can improve the statistical, computational and representation aspects of such classifiers. We present a baseline stacked ensemble and propose a weighted ensemble. Our proposed weighted ensemble can use multiple classifiers to improve classification results without hyperparameter tuning or data overfitting. We evaluate our ensemble models with two datasets. The first dataset is from Semeval2018-Task 1 and contains almost 7000 Tweets, labeled with 11 sentiment classes. The second dataset is the Toxic Comment Dataset with more than 150,000 comments, labeled with six different levels of abuse or harassment. Our results suggest that ensemble learning improves classification results by 1.5% to 5.4%.
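The abstract does not say how the weighted ensemble assigns or applies its weights, so the following is only a minimal sketch of one common form of weighted ensembling for multilabel output: a weighted average of per-label probabilities followed by a threshold. The function name `weighted_ensemble`, the example probabilities, the weights and the 0.5 threshold are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def weighted_ensemble(probs, weights, threshold=0.5):
    """Weighted average of per-label probabilities from several models.

    probs:   shape (n_models, n_samples, n_labels), values in [0, 1]
    weights: shape (n_models,), one non-negative weight per model
             (assumed here; e.g. derived from validation scores)
    Returns the averaged probabilities and the binary label matrix.
    """
    probs = np.asarray(probs, dtype=float)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()  # normalise so the weights sum to 1
    # Contract the model axis: result has shape (n_samples, n_labels).
    combined = np.tensordot(weights, probs, axes=1)
    labels = (combined >= threshold).astype(int)
    return combined, labels

# Hypothetical outputs of two classifiers for 2 samples and 3 labels.
model_a = [[0.9, 0.2, 0.6],
           [0.1, 0.7, 0.4]]
model_b = [[0.7, 0.6, 0.2],
           [0.3, 0.9, 0.2]]

combined, labels = weighted_ensemble([model_a, model_b], weights=[0.6, 0.4])
print(combined)
print(labels)
```

Each label is decided independently, so a sample can receive several labels or none, which matches the multilabel binary setting described in the abstract.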


Pages: 14

50 items in total

[21] Assessing the Quality of User-Generated Content
Stefan Winkler
ZTE Communications, 2013, 11 (01) : 37 - 40
[22] The future of user-generated content is now
Marino, Gregoire
JOURNAL OF INTELLECTUAL PROPERTY LAW & PRACTICE, 2013, 8 (03) : 183 - 183
[23] Principles for Modeling User-Generated Content
Lukyanenko, Roman
Parsons, Jeffrey
CONCEPTUAL MODELING, ER 2015, 2015, 9381 : 432 - 440
[24] A Solution for Navigating User-Generated Content
Uusitalo, Severi
Eskolin, Peter
Belimpasakis, Petros
2009 8TH IEEE INTERNATIONAL SYMPOSIUM ON MIXED AND AUGMENTED REALITY - SCIENCE AND TECHNOLOGY, 2009, : 219 - 220
[25] Generative AI in User-Generated Content
Hua, Yiqing
Niu, Shuo
Cai, Jie
Chilton, Lydia B.
Heuer, Hendrik
Wohn, Donghee Yvette
EXTENDED ABSTRACTS OF THE 2024 CHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS, CHI 2024, 2024,
[26] Extraversion as a stimulus for user-generated content
Pagani, Margherita
Goldsmith, Ronald E.
Hofacker, Charles F.
JOURNAL OF RESEARCH IN INTERACTIVE MARKETING, 2013, 7 (04) : 242 - 256
[27] Editorial: Online User Behavior and User-Generated Content
Saura, Jose Ramon
Dwivedi, Yogesh K.
Palacios-Marques, Daniel
FRONTIERS IN PSYCHOLOGY, 2022, 13
[28] Identifying Privacy Leakage from User-Generated Content in An Online Health Community - A deep learning approach
Zhu, Yushan
Tong, Xing
Wang, Xi
2019 IEEE INTERNATIONAL CONFERENCE ON HEALTHCARE INFORMATICS (ICHI), 2019, : 407 - 408
[29] Midwifery learning and forecasting: Predicting content demand with user-generated logs
Guitart, Anna
Fernandez del Rio, Ana
Perianez, Africa
Bellhouse, Lauren
ARTIFICIAL INTELLIGENCE IN MEDICINE, 2023, 138
[30] A Supervised Machine Learning Approach for the Credibility Assessment of User-Generated Content
Jain, Praphula Kumar
Pamula, Rajendra
Ansari, Sarfraj
WIRELESS PERSONAL COMMUNICATIONS, 2021, 118 (04) : 2469 - 2485
