An Embedding Learning Framework for Numerical Features in CTR Prediction

Citations: 65
Authors
Guo, Huifeng [1]
Chen, Bo [1]
Tang, Ruiming [1]
Zhang, Weinan [2]
Li, Zhenguo [1]
He, Xiuqiang [1]
Affiliations
[1] Huawei Noah's Ark Lab, Hong Kong, People's Republic of China
[2] Shanghai Jiao Tong University, Shanghai, People's Republic of China
Source
KDD '21: PROCEEDINGS OF THE 27TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING | 2021
Funding
National Natural Science Foundation of China;
Keywords
Numerical Features; Embedding Learning; Click-Through Rate Prediction; Neural Network;
DOI
10.1145/3447548.3467077
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Click-Through Rate (CTR) prediction is critical for industrial recommender systems, where most deep CTR models follow an Embedding & Feature Interaction paradigm. However, the majority of methods focus on designing network architectures to better capture feature interactions, while feature embedding, especially for numerical features, has been overlooked. Existing approaches to numerical features struggle to capture informative knowledge, either because of their low capacity or because they rely on hard discretization derived from offline, expert-driven feature engineering. In this paper, we propose a novel embedding learning framework for numerical features in CTR prediction (AutoDis) that offers high model capacity and end-to-end training while preserving unique representations. AutoDis consists of three core components: meta-embeddings, automatic discretization and aggregation. Specifically, we propose meta-embeddings for each numerical field to learn global knowledge at the field level with a manageable number of parameters. The differentiable automatic discretization then performs soft discretization and captures the correlations between the numerical features and the meta-embeddings. Finally, distinctive and informative embeddings are learned via an aggregation function. Comprehensive experiments on two public datasets and one industrial dataset validate the effectiveness of AutoDis. Moreover, AutoDis has been deployed on a mainstream advertising platform, where an online A/B test shows improvements over the base model of 2.1% in CTR and 2.7% in eCPM. In addition, the code of our framework is publicly available in MindSpore.
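The three components named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the abstract does not say how the soft discretization or the aggregation is computed, so the linear scorer, the temperature softmax, the weighted-sum aggregation, and every name below (`SoftDiscretizedField`, `bucket_weights`, `embed`, `num_buckets`) are assumptions made for illustration only. No training step is shown.

```python
import math
import random


def softmax(scores, temperature=1.0):
    """Numerically stable softmax with a temperature (an assumed choice)."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


class SoftDiscretizedField:
    """One numerical field with its own small table of meta-embeddings."""

    def __init__(self, num_buckets=4, dim=3, temperature=1.0, seed=0):
        rng = random.Random(seed)
        self.temperature = temperature
        # Meta-embeddings: num_buckets vectors shared by all values of this field.
        self.meta = [[rng.gauss(0.0, 0.1) for _ in range(dim)]
                     for _ in range(num_buckets)]
        # Hypothetical linear scorer: maps a scalar to one score per bucket.
        self.weight = [rng.gauss(0.0, 1.0) for _ in range(num_buckets)]
        self.bias = [0.0] * num_buckets

    def bucket_weights(self, x):
        """Soft discretization: a distribution over buckets, not a hard bin."""
        scores = [w * x + b for w, b in zip(self.weight, self.bias)]
        return softmax(scores, self.temperature)

    def embed(self, x):
        """Aggregation (assumed weighted sum) of the meta-embeddings."""
        probs = self.bucket_weights(x)
        dim = len(self.meta[0])
        return [sum(p * row[k] for p, row in zip(probs, self.meta))
                for k in range(dim)]


field = SoftDiscretizedField()
weights = field.bucket_weights(0.5)  # one weight per bucket, summing to 1
embedding = field.embed(0.5)         # a vector of length dim
```

Because every operation here is differentiable in the scorer parameters and the meta-embeddings, a table like this could in principle be trained end-to-end with the rest of a CTR model, and the parameter count per field grows with `num_buckets * dim` rather than with the number of distinct feature values.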
Pages: 2910-2918
Number of pages: 9