ESAM: Discriminative Domain Adaptation with Non-Displayed Items to Improve Long-Tail Performance

Cited by: 77
Authors
Chen, Zhihong [1, 2]
Xiao, Rong [2 ]
Li, Chenliang [3 ]
Ye, Gangfeng [2 ]
Sun, Haochuan [2 ]
Deng, Hongbo [2 ]
Affiliations
[1] Zhejiang Univ, Inst Informat Sci & Elect Engn, Hangzhou, Peoples R China
[2] Alibaba Grp, Hangzhou, Peoples R China
[3] Wuhan Univ, Sch Cyber Sci & Engn, Wuhan, Peoples R China
Source
PROCEEDINGS OF THE 43RD INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '20) | 2020
Funding
National Natural Science Foundation of China
Keywords
Domain Adaptation; Ranking Model; Non-displayed Items
DOI
10.1145/3397271.3401043
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline Classification Code
0812
Abstract
Most ranking models are trained only on displayed items (mostly hot items), but they are used to retrieve items from the entire space, which consists of both displayed and non-displayed items (mostly long-tail items). Because of this sample selection bias, long-tail items lack sufficient records to learn good feature representations, which shows up as data sparsity and cold-start problems. The resulting distribution discrepancy between displayed and non-displayed items causes poor long-tail performance. To address this problem, we propose an entire space adaptation model (ESAM) from the perspective of domain adaptation (DA). ESAM treats displayed and non-displayed items as the source and target domains, respectively. Specifically, we design an attribute correlation alignment that uses the correlation between high-level attributes of items to achieve distribution alignment. Furthermore, we introduce two effective regularization strategies, center-wise clustering and self-training, to improve the DA process. Without requiring any auxiliary information or auxiliary domains, ESAM transfers knowledge from displayed items to non-displayed items to alleviate the distribution inconsistency. Experiments on two public datasets and a large-scale industrial dataset collected from Taobao demonstrate that ESAM achieves state-of-the-art performance, especially in the long-tail space. In addition, we deploy ESAM in the Taobao search engine, leading to a significant improvement in online performance. The code is available at https://github.com/A-bonel/ESAM.git.
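The abstract does not give the formula for attribute correlation alignment, so the following is only a minimal CORAL-style sketch of the general idea: compare the pairwise correlations between feature dimensions computed on displayed (source) items with those computed on non-displayed (target) items, and penalize the difference. The function names, the toy feature matrices, and the normalization by the squared feature dimension are assumptions made for illustration, not the authors' implementation.

```python
from typing import List

Matrix = List[List[float]]  # rows are items, columns are high-level attributes


def attribute_correlation(features: Matrix) -> Matrix:
    """Return the L x L matrix of mean pairwise products between feature columns."""
    n = len(features)
    dim = len(features[0])
    return [
        [sum(row[j] * row[k] for row in features) / n for k in range(dim)]
        for j in range(dim)
    ]


def correlation_alignment_loss(source: Matrix, target: Matrix) -> float:
    """Mean squared difference between source and target correlation matrices.

    Hypothetical helper: a generic correlation-alignment penalty, assumed here
    as a stand-in for the alignment term described in the abstract.
    """
    dim = len(source[0])
    cs = attribute_correlation(source)
    ct = attribute_correlation(target)
    total = sum((cs[j][k] - ct[j][k]) ** 2 for j in range(dim) for k in range(dim))
    return total / (dim * dim)


# Toy features (assumed values): two items with two attributes per domain.
displayed = [[1.0, 0.0], [0.0, 1.0]]
non_displayed = [[1.0, 1.0], [1.0, 1.0]]

loss_same = correlation_alignment_loss(displayed, displayed)
loss_diff = correlation_alignment_loss(displayed, non_displayed)
```

In a ranking model a term like this would typically be added to the main ranking loss with a weight, so that the item encoder is pushed toward producing similar attribute correlations for displayed and non-displayed items.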
Pages: 579-588
Page count: 10