The power of ensemble learning in sentiment analysis

Cited by: 50
Authors
Kazmaier, Jacqueline [1 ]
van Vuuren, Jan H. [1]
Affiliations
[1] Stellenbosch University, Department of Industrial Engineering, Stellenbosch Unit for Operations Research in Engineering, Private Bag 11, ZA-7602 Matieland, South Africa
Keywords
Ensemble learning; Sentiment analysis; Machine learning; Natural language processing; Optimization; Classifiers; Scheme
DOI
10.1016/j.eswa.2021.115819
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
An ensemble of models is a set of learning models whose individual predictions are combined in such a way that the component models compensate for one another's weaknesses. Although there has been growing interest in ensemble learning techniques in the general machine learning community, the use of ensembles in sentiment classification is still limited. Moreover, much of the research activity on ensemble learning is centred on homogeneous ensembles, although heterogeneous ensembles may prove very useful when combining pre-trained models, which are often readily available. In this paper, several techniques for constructing heterogeneous ensembles are applied and comparatively evaluated on benchmark sentiment classification data sets from four different domains. Median performance improvements of up to 5.53% over the best individual model are observed for several ensemble configurations on all four validation data sets, and clear trends are identified that may prove useful to other researchers in the field. Furthermore, a novel ensemble selection approach is proposed that avoids both the storage of individual predictions and the costly retraining of all candidate models, which are often required by similar approaches.
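
As a rough, illustrative sketch only (not the ensemble selection approach proposed in the paper), the following Python snippet shows what a heterogeneous ensemble for sentiment classification can look like: three different model families are combined by majority vote using scikit-learn. The toy texts, labels, and choice of base learners are placeholder assumptions for illustration.

# Minimal sketch of a heterogeneous (majority-vote) ensemble for sentiment
# classification using scikit-learn. It illustrates the general idea of
# combining different model families so they compensate for one another's
# weaknesses; it is NOT the ensemble selection procedure from the paper.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline
from sklearn.ensemble import VotingClassifier

# Toy data; in practice these would be benchmark review data sets.
texts = ["great product, loved it", "terrible, waste of money",
         "works as expected", "broke after one day"]
labels = [1, 0, 1, 0]  # 1 = positive, 0 = negative

# Three heterogeneous base learners, each with its own TF-IDF pipeline.
base_learners = [
    ("lr",  make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))),
    ("nb",  make_pipeline(TfidfVectorizer(), MultinomialNB())),
    ("svm", make_pipeline(TfidfVectorizer(), LinearSVC())),
]

# Hard voting combines the component predictions by majority rule.
ensemble = VotingClassifier(estimators=base_learners, voting="hard")
ensemble.fit(texts, labels)
print(ensemble.predict(["surprisingly good value"]))
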
Page count: 16