A Multi-Model Approach for User Portrait

Cited by: 20
Authors
Chen, Yanbo [1]
He, Jingsha [1]
Wei, Wei [1]
Zhu, Nafei [1]
Yu, Cong [1]
Affiliations
[1] Beijing University of Technology, Faculty of Information Technology, Beijing 100124, People's Republic of China
Keywords
user portrait; machine learning; multi-model ensemble
DOI
10.3390/fi13060147
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Age, gender, and educational background are among the most basic attributes for identifying and portraying users. Deeper mining and higher-level prediction built on such attributes can reveal users' preferences and personalities, which in turn helps improve their online experience and enables personalized services in real applications. In this paper, we use machine learning classification algorithms to predict users' demographic attributes (gender, age, and educational background) from one month of data collected from the Sogou search engine, with the goal of building user portraits. We adopt a multi-model approach based on fusion algorithms. The proposed model has a two-stage structure and is trained on one month of data with demographic labels. The first stage consists of traditional machine learning models and neural network models, and the second stage combines the models from the first stage. Experimental results show that the proposed multi-model method predicts user attributes more accurately than single-model methods. It also generalizes better when predicting users' demographic attributes, making it better suited to user profiling.
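The two-stage structure described in the abstract can be illustrated with a minimal sketch. The abstract does not say which base models or which fusion algorithm the authors use, so everything here is an assumption made for illustration: the two toy text classifiers (`NaiveBayesText`, `TokenVote`), the accuracy-weighted soft vote in `fuse`, the placeholder labels "A" and "B" standing in for a demographic attribute, and the tiny invented query strings. It is not the paper's model and uses no Sogou data.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesText:
    """Stage-1 model (assumed): multinomial naive Bayes, add-one smoothing."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.prior = Counter(labels)
        self.counts = {c: Counter() for c in self.classes}
        for doc, y in zip(docs, labels):
            self.counts[y].update(doc.split())
        self.vocab = {w for c in self.classes for w in self.counts[c]}
        return self

    def predict_proba(self, doc):
        n = sum(self.prior.values())
        logp = {}
        for c in self.classes:
            total = sum(self.counts[c].values()) + len(self.vocab)
            lp = math.log(self.prior[c] / n)
            for w in doc.split():
                if w in self.vocab:  # ignore tokens never seen in training
                    lp += math.log((self.counts[c][w] + 1) / total)
            logp[c] = lp
        top = max(logp.values())
        exp = {c: math.exp(v - top) for c, v in logp.items()}
        z = sum(exp.values())
        return {c: v / z for c, v in exp.items()}


class TokenVote:
    """Stage-1 model (assumed): each known token votes with its class shares."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.seen = defaultdict(Counter)
        for doc, y in zip(docs, labels):
            for w in set(doc.split()):
                self.seen[w][y] += 1
        return self

    def predict_proba(self, doc):
        votes = {c: 1.0 for c in self.classes}  # uniform smoothing vote
        for w in doc.split():
            seen = self.seen.get(w)
            if seen:
                total = sum(seen.values())
                for c in self.classes:
                    votes[c] += seen[c] / total
        z = sum(votes.values())
        return {c: v / z for c, v in votes.items()}


def accuracy(model, data):
    """Fraction of (doc, label) pairs whose top class matches the label."""
    hits = 0
    for doc, y in data:
        proba = model.predict_proba(doc)
        hits += max(proba, key=proba.get) == y
    return hits / len(data)


def fuse(models, weights, doc):
    """Stage 2 (assumed): weighted average of stage-1 class probabilities."""
    total = sum(weights)
    probas = [m.predict_proba(doc) for m in models]
    return {c: sum(w * p[c] for w, p in zip(weights, probas)) / total
            for c in models[0].classes}


# Invented toy data; "A" and "B" are placeholder attribute values.
train = [("football scores league", "A"), ("league transfer news", "A"),
         ("knitting pattern wool", "B"), ("wool scarf pattern", "B")]
valid = [("football transfer", "A"), ("scarf knitting", "B")]

docs, labels = zip(*train)
models = [NaiveBayesText().fit(docs, labels), TokenVote().fit(docs, labels)]

# Weight each stage-1 model by held-out accuracy; fall back to equal weights.
weights = [accuracy(m, valid) for m in models]
if sum(weights) == 0:
    weights = [1.0] * len(models)

fused = fuse(models, weights, "league scores")
prediction = max(fused, key=fused.get)
print(prediction, fused)
```

Weighting by held-out accuracy is only one simple way to combine first-stage outputs. A learned second-stage model trained on those outputs (stacking) is another common choice, and the abstract does not say which the authors use.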
Pages: 14