LESS IS MORE: DEEP LEARNING USING SUBJECTIVE ANNOTATIONS FOR SENTIMENT ANALYSIS FROM SOCIAL MEDIA

Cited by: 0
Authors
Tzogka, Christina [1]
Passalis, Nikolaos [2]
Iosifidis, Alexandros [3]
Gabbouj, Moncef [2]
Tefas, Anastasios [1]
Affiliations
[1] Aristotle Univ Thessaloniki, Dept Informat, Thessaloniki, Greece
[2] Tampere Univ, Fac Informat Technol & Commun, Tampere, Finland
[3] Aarhus Univ, Dept Engn Elect & Comp Engn, Aarhus, Denmark
Source
2019 IEEE 29TH INTERNATIONAL WORKSHOP ON MACHINE LEARNING FOR SIGNAL PROCESSING (MLSP) | 2019
Keywords
Large-Scale Sentiment Analysis; Subjective Annotations; Deep Learning; Twitter Dataset
DOI
10.1109/mlsp.2019.8918792
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Acquiring reliable training annotations for the huge amounts of data collected in large-scale applications is often infeasible, especially for inherently subjective tasks such as sentiment analysis. In these cases, the data are usually annotated using semi-automated methods. Even when crowd-sourcing is used, ensuring the quality of the acquired annotations can be challenging. Therefore, a number of important questions arise when annotating such datasets: Does using more data always increase the accuracy of a model, regardless of the quality of the annotations? Is there any way of selecting which data samples to use when the annotations are unreliable? Is there a point at which using unreliable annotations actually harms the performance of deep models instead of helping? In this work, we provide an extensive study on training deep sentiment analysis models with unreliably annotated data, and we propose a simple yet effective semi-supervised learning method to overcome the aforementioned limitations.
Pages: 6