Hierarchical Neural Language Models for Joint Representation of Streaming Documents and their Content

Cited: 50
Authors
Djuric, Nemanja [1 ]
Wu, Hao [1 ,2 ]
Radosavljevic, Vladan [1 ]
Grbovic, Mihajlo [1 ]
Bhamidipati, Narayan [1 ]
Affiliations
[1] Yahoo Labs, Sunnyvale, CA USA
[2] Univ Southern Calif, Los Angeles, CA USA
Source
PROCEEDINGS OF THE 24TH INTERNATIONAL CONFERENCE ON WORLD WIDE WEB (WWW 2015) | 2015
Keywords
Machine learning; document modeling; distributed representations; word embeddings; document embeddings
DOI
10.1145/2736277.2741643
CLC Classification Number
TP [Automation technology; computer technology]
Subject Classification Code
0812
Abstract
We consider the problem of learning distributed representations for documents in data streams. The documents are represented as low-dimensional vectors and are jointly learned with distributed vector representations of word tokens using a hierarchical framework with two embedded neural language models. In particular, we exploit the context of documents in streams and use one of the language models to model the document sequences, and the other to model word sequences within them. The models learn continuous vector representations for both word tokens and documents such that semantically similar documents and words are close in a common vector space. We discuss extensions to our model, which can be applied to personalized recommendation and social relationship mining by adding further user layers to the hierarchy, thus learning user-specific vectors that represent individual preferences. We validated the learned representations on a public movie rating data set from MovieLens, as well as on a large-scale Yahoo News data set comprising three months of user activity logs collected on Yahoo servers. The results indicate that the proposed model can learn useful representations of both documents and word tokens, outperforming the current state-of-the-art by a large margin.
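The hierarchical idea in the abstract can be illustrated with a minimal toy sketch: one skip-gram-style model over the document stream (adjacent documents predict each other) and another over word sequences within each document, with document vectors also predicting their own words so both live in a common space. This is an illustrative assumption-laden sketch, not the authors' implementation: the toy stream, the plain sigmoid updates without negative sampling, and all hyperparameters are invented for clarity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stream: documents (as word-id lists) in the order they appeared.
stream = [[0, 1, 2], [1, 2, 3], [4, 5], [5, 6, 4]]
n_docs, n_words, dim = len(stream), 7, 8

D = rng.normal(scale=0.1, size=(n_docs, dim))   # document vectors
W = rng.normal(scale=0.1, size=(n_words, dim))  # word vectors

def sgd_pair(A, i, B, j, lr=0.05):
    """One positive skip-gram-style update: raise sigmoid(A[i] . B[j])."""
    score = 1.0 / (1.0 + np.exp(-(A[i] @ B[j])))
    grad = lr * (1.0 - score)  # attraction only; no negative sampling here
    A[i], B[j] = A[i] + grad * B[j], B[j] + grad * A[i]

for _ in range(200):
    for d, words in enumerate(stream):
        # Upper model: the document sequence in the stream.
        if d + 1 < len(stream):
            sgd_pair(D, d, D, d + 1)
        for t, w in enumerate(words):
            # Joint space: a document vector predicts its own words.
            sgd_pair(D, d, W, w)
            # Lower model: word sequences within the document.
            if t + 1 < len(words):
                sgd_pair(W, w, W, words[t + 1])

def cos(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Adjacent documents sharing vocabulary end up close in the common space.
print(cos(D[0], D[1]))
```

With only positive updates everything drifts together, so a real implementation would add negative sampling or hierarchical softmax as in the paper's word2vec-style training; the sketch only shows how the two language models share one embedding space.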
Pages: 248-255
Page count: 8
Related Papers (20 total)
  • [1] [Anonymous], 2007, P 24 INT C MACH LEAR, DOI 10.1145/1273496.1273577
  • [2] [Anonymous], 2005, AISTATS
  • [3] [Anonymous], 2014, arXiv:1406.2710
  • [4] Baeza-Yates R., 2015, P 8 ACM INT C WEB SE
  • [5] Bengio Y, 2001, ADV NEUR IN, V13, P932
  • [6] Blei DM, Ng AY, Jordan MI, 2003, Latent Dirichlet Allocation, JOURNAL OF MACHINE LEARNING RESEARCH, 3(4-5): 993-1022
  • [7] Bordes Antoine, 2013, Proceedings of the 26th International Conference on Neural Information Processing Systems, V26, P2787
  • [8] Chang Chih-Chung, Lin Chih-Jen, 2011, LIBSVM: A Library for Support Vector Machines, ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2(3)
  • [9] Collobert Ronan, 2008, ICML, P160, DOI [DOI 10.1145/1390156.1390177, 10.1145/1390156.1390177]
  • [10] Djuric Nemanja, Radosavljevic Vladan, Grbovic Mihajlo, Bhamidipati Narayan, 2014, Hidden Conditional Random Fields with Distributed User Embeddings for Ad Targeting, 2014 IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM): 779-784