Stacking-Based Ensemble Learning of Self-Media Data for Marketing Intention Detection

Cited by: 18
Authors
Wang, Yufeng [1]
Liu, Shuangrong [1]
Li, Songqian [1]
Duan, Jidong [1]
Hou, Zhihao [1]
Yu, Jia [1]
Ma, Kun [1, 2]
Affiliations
[1] Univ Jinan, Sch Informat Sci & Engn, Jinan 250022, Shandong, Peoples R China
[2] Univ Jinan, Shandong Prov Key Lab Network Based Intelligent C, Jinan 250022, Shandong, Peoples R China
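Funding
National Natural Science Foundation of China
Keywords
marketing intention; feature extraction; ensemble learning
DOI
10.3390/fi11070155
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract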
Social network services for self-media, such as Weibo, Blog, and WeChat Public, constitute a powerful medium that allows users to publish posts every day. Because information transparency is insufficient, malicious online marketing in self-media posts can harm society. It is therefore necessary to identify news with marketing intentions. We treat the identification of marketing intentions as a text classification problem. Although methods for intention detection exist, the challenges are how to extract text features that reflect semantic information and how to reduce the time and space complexity of the recognition model. To this end, this paper proposes a machine learning method to identify marketing intentions from large-scale self-media (We-Media) data. First, the proposed Latent Semantic Indexing (LSI)-Word2vec model captures semantic features. Second, the decision tree model is simplified by pruning to save computing resources and reduce time complexity. Third, the paper examines the effects of classifier combinations and uses the optimal configuration to help people identify marketing intention efficiently. A detailed experimental evaluation on several metrics shows that the approach is effective and efficient: the F1 value increases by about 5% and the running time improves by 20%, which indicates that the proposed method can effectively improve the accuracy of marketing news recognition.
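The three steps named in the abstract (semantic feature extraction, a pruned decision tree, and a stacked combination of classifiers) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. It assumes scikit-learn. The toy corpus, the choice of GaussianNB as a second base learner, logistic regression as the meta-learner, and all parameter values are invented for the example. Only the LSI half of the LSI-Word2vec features is shown (TF-IDF followed by truncated SVD), and cost-complexity pruning via `ccp_alpha` stands in for whatever pruning rule the paper uses.

```python
from sklearn.decomposition import TruncatedSVD
from sklearn.ensemble import StackingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

# Toy corpus invented for illustration: 1 = marketing, 0 = not marketing.
texts = [
    "buy now limited discount offer",
    "huge sale buy this product today",
    "exclusive discount coupon order now",
    "best price offer buy today",
    "special sale coupon limited offer",
    "order now cheap price discount",
    "city council meets to discuss budget",
    "heavy rain expected in the north",
    "local team wins the championship game",
    "new study on climate and rain",
    "council announces road repair plan",
    "team prepares for the final game",
]
labels = [1] * 6 + [0] * 6

# LSI-style features: TF-IDF weights projected onto a few latent dimensions.
# (The Word2vec half of the paper's feature model is omitted here.)
lsi = make_pipeline(
    TfidfVectorizer(),
    TruncatedSVD(n_components=4, random_state=0),
)

# Base learners; ccp_alpha > 0 enables cost-complexity pruning of the tree.
base_learners = [
    ("pruned_tree", DecisionTreeClassifier(ccp_alpha=0.01, random_state=0)),
    ("naive_bayes", GaussianNB()),
]

# Stacking: a meta-learner is trained on cross-validated base predictions.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(),
    cv=3,
)

model = make_pipeline(lsi, stack)
model.fit(texts, labels)

predictions = model.predict(["limited offer buy now", "rain delays the game"])
print(predictions)
```

On a corpus this small the predictions are not meaningful. The sketch only shows how the pieces fit together, and comparing different sets of base learners is where the "optimal configuration" mentioned in the abstract would come from.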
Pages: 12