Contrastive Learning for Debiased Candidate Generation in Large-Scale Recommender Systems

Cited by: 71
Authors
Zhou, Chang [1]
Ma, Jianxin [1]
Zhang, Jianwei [1]
Zhou, Jingren [1]
Yang, Hongxia [1]
Affiliations
[1] Alibaba Group, DAMO Academy, Hangzhou, People's Republic of China
Source
KDD '21: PROCEEDINGS OF THE 27TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING | 2021
Keywords
Recommender systems; candidate generation; bias reduction; inverse propensity weighting; contrastive learning; negative sampling; NETWORK
DOI
10.1145/3447548.3467102
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Deep candidate generation (DCG), which narrows down the collection of relevant items from billions to hundreds via representation learning, has become prevalent in industrial recommender systems. Standard approaches approximate maximum likelihood estimation (MLE) through sampling for better scalability and address the problem of DCG in a way similar to language modeling. However, live recommender systems face severe exposure bias and have a vocabulary several orders of magnitude larger than that of natural language, implying that MLE will preserve and even exacerbate the exposure bias in the long run in order to faithfully fit the observed samples. In this paper, we theoretically prove that a popular choice of contrastive loss is equivalent to reducing the exposure bias via inverse propensity weighting, which provides a new perspective for understanding the effectiveness of contrastive learning. Based on this theoretical discovery, we design CLRec, a Contrastive Learning method to improve DCG in terms of fairness, effectiveness and efficiency in Recommender systems with an extremely large candidate size. We further improve upon CLRec and propose Multi-CLRec for accurate multi-intention-aware bias reduction. Our methods have been successfully deployed in Taobao, where online A/B tests running for at least four months and offline analyses demonstrate their substantial improvements, including a dramatic reduction in the Matthew effect.
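The abstract does not say which contrastive loss it analyzes, so the sketch below is only an illustration under stated assumptions: an InfoNCE-style sampled-softmax loss whose negatives are drawn in proportion to how often each item was exposed. The names `info_nce_loss`, `sample_negatives`, and `exposure_counts` are hypothetical and are not taken from the paper or this record. It shows the general shape of such a loss, not the CLRec implementation.

```python
import math
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def info_nce_loss(user_vec, pos_item_vec, neg_item_vecs, temperature=1.0):
    """InfoNCE-style loss: -log softmax score of the positive item
    against the positive plus the sampled negatives."""
    logits = [dot(user_vec, pos_item_vec) / temperature]
    logits += [dot(user_vec, n) / temperature for n in neg_item_vecs]
    # Log-sum-exp with max subtraction for numerical stability.
    m = max(logits)
    log_norm = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_norm - logits[0]


def sample_negatives(item_vecs, exposure_counts, k, rng):
    """Assumption: draw k negatives (with replacement) with probability
    proportional to each item's exposure count, so that frequently
    exposed items are pushed down more often."""
    ids = list(item_vecs)
    weights = [exposure_counts[i] for i in ids]
    chosen = rng.choices(ids, weights=weights, k=k)
    return [item_vecs[i] for i in chosen]


if __name__ == "__main__":
    rng = random.Random(0)
    items = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.5, 0.5]}
    exposure = {"a": 100, "b": 10, "c": 1}  # toy exposure counts
    negatives = sample_negatives(items, exposure, k=4, rng=rng)
    print(info_nce_loss([1.0, 0.0], items["b"], negatives))
```

Sampling negatives by exposure frequency is what loosely connects a loss of this kind to inverse propensity weighting: heavily exposed items appear more often in the denominator, which counteracts their over-representation among the observed positives.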
Pages: 3985-3995
Page count: 11