Multi-Faceted Knowledge-Driven Pre-Training for Product Representation Learning

Cited by: 2
Authors
Zhang, Denghui [1]
Liu, Yanchi [4]
Yuan, Zixuan [2]
Fu, Yanjie [5]
Chen, Haifeng
Xiong, Hui [3]
Affiliations
[1] Rutgers State Univ, Informat Syst Dept, Newark, NJ 07103 USA
[2] Rutgers State Univ, Management Sci & Informat Syst Dept, Newark, NJ 07103 USA
[3] Rutgers State Univ, Newark, NJ 07103 USA
[4] NEC Labs Amer, Princeton, NJ 08540 USA
[5] Univ Cent Florida, Dept Comp Sci, Orlando, FL 32816 USA
Funding
U.S. National Science Foundation;
Keywords
Task analysis; Monitoring; Semantics; Pediatrics; Representation learning; Electronic publishing; Electronic commerce; Product representation learning; product search; product matching; product classification; pre-trained language models;
DOI
10.1109/TKDE.2022.3200921
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
As a key component of e-commerce computing, product representation learning (PRL) benefits a variety of applications, including product matching, search, and categorization. Existing PRL approaches have poor language understanding ability because they cannot capture contextualized semantics. In addition, the representations learned by existing methods are not easily transferable to new products. Inspired by recent advances in pre-trained language models (PLMs), we attempt to adapt PLMs for PRL to mitigate these issues. In this article, we develop KINDLE, a Knowledge-drIven pre-trainiNg framework for proDuct representation LEarning, which preserves contextual semantics and multi-faceted product knowledge robustly and flexibly. Specifically, we first extend traditional one-stage pre-training to a two-stage pre-training framework and use a deliberately designed knowledge encoder to ensure smooth knowledge fusion into the PLM. In addition, we propose a multi-objective heterogeneous embedding method to represent thousands of knowledge elements. These embeddings replace isolated classes as training targets in knowledge acquisition tasks, which helps KINDLE calibrate knowledge noise and sparsity automatically. Furthermore, an input-aware gating network is proposed to select the most relevant knowledge for different downstream tasks. Finally, extensive experiments demonstrate the advantages of KINDLE over state-of-the-art baselines across three downstream tasks.
Pages: 7239-7250
Page count: 12