Large-Scale Taxonomy Categorization for Noisy Product Listings

Authors
Das, Pradipto [1]
Xia, Yandi
Levine, Aaron
Di Fabbrizio, Giuseppe
Datta, Ankur
Affiliations
[1] Rakuten Inst Technol, Boston, MA 02110 USA
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
E-commerce catalogs include a continuously growing number of products that are constantly updated. Each item in a catalog is characterized by several attributes and identified by a taxonomy label. Categorizing products with their taxonomy labels is fundamental to effectively searching and organizing listings in a catalog. However, manual and/or rule-based approaches to categorization are not scalable. In this paper, we compare several classifiers for product taxonomy categorization of top-level categories. First, we investigate a number of feature sets and observe that a combination of word unigrams from product names and navigational breadcrumbs works best for categorization. Second, we apply correspondence topic models to detect noisy data and introduce a lightweight manual process to improve dataset quality. Finally, we evaluate linear models, gradient boosted trees (GBTs), and convolutional neural networks (CNNs) with pre-trained word embeddings, demonstrating that, compared to other baselines, GBTs and CNNs yield the highest gains in error reduction.
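As a rough illustration of the feature set the abstract highlights (word unigrams from product names combined with navigational breadcrumbs), the sketch below feeds such features to a simple linear classifier. It is not the authors' implementation: the multiclass perceptron is a stand-in for the linear models mentioned, and the function names, field prefixes, and toy listings are all hypothetical.

```python
from collections import Counter, defaultdict


def unigram_features(name, breadcrumb):
    """Bag of lowercase word unigrams; prefixes keep the two fields apart."""
    feats = Counter()
    for tok in name.lower().split():
        feats["name=" + tok] += 1
    for tok in breadcrumb.lower().replace(">", " ").split():
        feats["crumb=" + tok] += 1
    return feats


def score(weights, label, feats):
    """Dot product between one label's weight vector and the features."""
    w = weights[label]
    return sum(w.get(f, 0.0) * v for f, v in feats.items())


def predict(weights, labels, feats):
    """Return the highest-scoring top-level category."""
    return max(labels, key=lambda lab: score(weights, lab, feats))


def train_perceptron(examples, epochs=10):
    """Multiclass perceptron (assumes at least two distinct labels)."""
    labels = sorted({gold for _, gold in examples})
    weights = {lab: defaultdict(float) for lab in labels}
    for _ in range(epochs):
        for feats, gold in examples:
            # Strongest competing label for this example.
            rival = max(
                (lab for lab in labels if lab != gold),
                key=lambda lab: score(weights, lab, feats),
            )
            # Update whenever the gold label does not strictly win.
            if score(weights, gold, feats) <= score(weights, rival, feats):
                for f, v in feats.items():
                    weights[gold][f] += v
                    weights[rival][f] -= v
    return weights, labels


# Hypothetical toy listings: (product name, breadcrumb, top-level category).
TOY = [
    ("Stainless steel chef knife 8 inch", "Home > Kitchen > Cutlery", "Home & Kitchen"),
    ("Nonstick frying pan 10 inch", "Home > Kitchen > Cookware", "Home & Kitchen"),
    ("Wireless bluetooth headphones", "Electronics > Audio > Headphones", "Electronics"),
    ("USB-C charging cable 2m", "Electronics > Accessories > Cables", "Electronics"),
]

examples = [(unigram_features(n, b), y) for n, b, y in TOY]
weights, labels = train_perceptron(examples)

query = unigram_features("Bluetooth speaker", "Electronics > Audio > Speakers")
print(predict(weights, labels, query))
```

Prefixing tokens with their source field lets the classifier weight a word differently depending on whether it appears in the product name or in the breadcrumb; whether the paper's features were built this way is an assumption of this sketch.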
Pages: 3885-3894
Number of pages: 10