Detection of Advertising Users Based on K-SMOTE and Ensemble Learning

Citations: 0
Authors
Qiu, Zihan [1]
Zhou, Zekai [1]
Long, Yongxu [1]
Ji, Chang [1]
Li, Jianguo [1]
Tang, Yong [1]
Affiliation
[1] South China Normal University, Guangzhou 510630, Guangdong, People's Republic of China
Source
HUMAN CENTERED COMPUTING, HCC 2021 | 2022 / Vol. 13795
Funding
National Natural Science Foundation of China;
Keywords
Social network; Unbalanced datasets; User classification; SMOTE; K-Means; Ensemble learning;
DOI
10.1007/978-3-031-23741-6_12
Chinese Library Classification (CLC) number
TP31 [Computer software];
Discipline classification code
081202 ; 0835 ;
Abstract
To address the unsatisfactory prediction results caused by unbalanced advertising-user data in social networks, we propose a prediction model for advertising users that combines K-Means, the synthetic minority oversampling technique (SMOTE), and ensemble learning. Using real user data provided by Scholat, we analyzed the data and extracted many key features from it to draw a portrait of advertising users. Our algorithm first clusters the minority class, then processes the continuous and discrete features of each sample separately through an improved SMOTE to synthesize new minority samples, and finally constructs an integrated classifier using ensemble learning. This method effectively avoids two problems of standard SMOTE: blurred boundaries between the positive and negative classes, and the inability to process discrete features. Meanwhile, ensemble learning enables the classifier to produce more reasonable results and reduces overall error. The experimental results show that our method improves the quality of the generated minority class samples and significantly improves the prediction performance for advertising users.
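The three stages named in the abstract (cluster the minority class, oversample continuous and discrete features separately, combine classifiers) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not say how the improved SMOTE treats discrete features or which ensemble scheme is used, so the per-column majority vote among neighbors and the hard-voting combiner below are assumptions, and the names kmeans, k_smote_sketch, and majority_vote are hypothetical.

```python
import numpy as np

def kmeans(X, k, rng, n_iter=20):
    """Plain Lloyd's k-means; returns one cluster label per row of X."""
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def k_smote_sketch(X_cont, X_disc, n_new, k_clusters=2, n_neighbors=3, seed=0):
    """Synthesize n_new minority samples inside k-means clusters.

    X_cont: float array of continuous features (minority class only).
    X_disc: non-negative integer array of discrete features, same row order.
    """
    rng = np.random.default_rng(seed)
    labels = kmeans(X_cont, k_clusters, rng)
    new_cont, new_disc = [], []
    while len(new_cont) < n_new:
        i = rng.integers(len(X_cont))
        # Candidate neighbors come only from the seed sample's own cluster,
        # so interpolation never crosses between clusters.
        members = np.flatnonzero(labels == labels[i])
        members = members[members != i]
        if len(members) == 0:
            continue  # singleton cluster: nothing to interpolate with
        d = np.linalg.norm(X_cont[members] - X_cont[i], axis=1)
        nn = members[np.argsort(d)[:n_neighbors]]
        j = rng.choice(nn)
        # Continuous features: classic SMOTE linear interpolation.
        gap = rng.random()
        new_cont.append(X_cont[i] + gap * (X_cont[j] - X_cont[i]))
        # Discrete features (assumption): most frequent value per column
        # among the seed sample and its neighbors.
        pool = X_disc[np.append(nn, i)]
        new_disc.append([np.bincount(col).argmax() for col in pool.T])
    return np.array(new_cont), np.array(new_disc)

def majority_vote(predictions):
    """Hard voting over 0/1 predictions of shape (n_models, n_samples)."""
    predictions = np.asarray(predictions)
    return (predictions.sum(axis=0) * 2 > len(predictions)).astype(int)
```

Restricting neighbors to a single cluster is what keeps synthetic points inside dense minority regions rather than on the line between two distant groups, which is the boundary-blurring issue the abstract attributes to plain SMOTE.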
Pages: 133-145
Page count: 13