Localized user-driven topic discovery via boosted ensemble of nonnegative matrix factorization

Cited by: 5
Authors
Suh, Sangho [1]
Shin, Sungbok [2]
Lee, Joonseok [3]
Reddy, Chandan K. [4]
Choo, Jaegul [2]
Affiliations
[1] Univ Waterloo, David R Cheriton Sch Comp Sci, Waterloo, ON, Canada
[2] Korea Univ, Dept Comp Sci & Engn, Seoul, South Korea
[3] Google Res, Machine Percept, Mountain View, CA USA
[4] Virginia Tech, Dept Comp Sci, Arlington, VA USA
Funding
National Research Foundation of Singapore; US National Science Foundation
Keywords
Topic modeling; Ensemble learning; Matrix factorization; Gradient boosting; Local weighting; CONSTRAINED LEAST-SQUARES; ALGORITHMS;
DOI
10.1007/s10115-017-1147-9
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Nonnegative matrix factorization (NMF) has been widely used in topic modeling of large-scale document corpora, where a set of underlying topics is extracted by a low-rank factor matrix from NMF. However, the resulting topics often convey only general, and thus redundant, information about the documents rather than information that might be minor but potentially meaningful to users. To address this problem, we present a novel ensemble method based on nonnegative matrix factorization that discovers meaningful local topics. Our method brings the idea of an ensemble model, which has shown advantages in supervised learning, into an unsupervised topic modeling context. That is, our model successively performs NMF on a residual matrix obtained from previous stages and generates a sequence of topic sets. The update algorithm we employ is novel in two respects. The first lies in utilizing the residual matrix, inspired by a state-of-the-art gradient boosting model, and the second stems from applying a sophisticated local weighting scheme to the given matrix to enhance the locality of topics, which in turn delivers high-quality, focused topics of interest to users. We subsequently extend this ensemble model by adding keyword- and document-based user interaction to introduce user-driven topic discovery.
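The stage-wise idea in the abstract (factorize, subtract, factorize the residual again) can be illustrated with a minimal sketch. This is not the authors' algorithm: the NMF solver here is plain multiplicative updates, the local weighting is a simplified cosine-similarity weighting, and the anchor choice (the document column with the largest residual norm) is an assumption made only for this illustration. The names `nmf` and `boosted_nmf` are hypothetical.

```python
import numpy as np

def nmf(X, k, n_iter=200, seed=0, eps=1e-9):
    """Rank-k NMF of a nonnegative matrix X via multiplicative updates."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, k)) + eps
    H = rng.random((k, n)) + eps
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

def boosted_nmf(X, k, n_stages, seed=0):
    """Run NMF stage by stage on a locally weighted residual matrix.

    X is a term-by-document matrix. Returns the per-stage (W, H)
    pairs and the final nonnegative residual.
    """
    R = X.astype(float).copy()
    stages = []
    for s in range(n_stages):
        col_norms = np.linalg.norm(R, axis=0)
        if col_norms.max() == 0:
            break  # nothing left to explain
        # Assumed anchor: the document with the most unexplained mass.
        anchor = R[:, np.argmax(col_norms)]
        # Simplified local weighting: cosine similarity of each
        # document column to the anchor (values lie in [0, 1]).
        cos = (anchor @ R) / (np.linalg.norm(anchor) * col_norms + 1e-12)
        R_local = R * cos  # scales column j by cos[j]
        W, H = nmf(R_local, k, seed=seed + s)
        stages.append((W, H))
        # Remove what this stage explained; clip to stay nonnegative.
        R = np.maximum(R - W @ H, 0.0)
    return stages, R
```

Each stage yields its own small topic set (the columns of `W`), so running `boosted_nmf(X, k=2, n_stages=3)` gives three sets of two topics each, with later stages fitted to whatever the earlier ones left unexplained.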
Pages: 503-531
Number of pages: 29
Related papers
40 in total
  • [21] IMPROVING MATRIX FACTORIZATION-BASED RECOMMENDER VIA ENSEMBLE METHODS
    Luo, Xin
    Ouyang, Yuanxin
    Xiong, Zhang
    INTERNATIONAL JOURNAL OF INFORMATION TECHNOLOGY & DECISION MAKING, 2011, 10 (03) : 539 - 561
  • [22] DC-NMF: nonnegative matrix factorization based on divide-and-conquer for fast clustering and topic modeling
    Rundong Du
    Da Kuang
    Barry Drake
    Haesun Park
    Journal of Global Optimization, 2017, 68 : 777 - 798
  • [23] Simultaneous Dimensionality Reduction and Classification via Dual Embedding Regularized Nonnegative Matrix Factorization
    Wu, Wenhui
    Kwong, Sam
    Hou, Junhui
    Jia, Yuheng
    Ip, Horace Ho Shing
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2019, 28 (08) : 3836 - 3847
  • [24] FTM: Recommending the Right Items for User Temporal Interests with Matrix Factorization through Topic Model
    Shang, Yanmin
    Xu, Kefu
    Han, Yi
    Zhang, Chuang
    2016 IEEE FIRST INTERNATIONAL CONFERENCE ON DATA SCIENCE IN CYBERSPACE (DSC 2016), 2016, : 189 - 198
  • [25] Topic Diffusion Discovery based on Sparseness-constrained Non-negative Matrix Factorization
    Kang, Yihuang
    Lin, Keng-Pei
    Cheng, I-Ling
    2018 IEEE INTERNATIONAL CONFERENCE ON INFORMATION REUSE AND INTEGRATION (IRI), 2018, : 94 - 101
  • [26] Bicriteria Sparse Nonnegative Matrix Factorization via Two-Timescale Duplex Neurodynamic Optimization
    Che, Hangjun
    Wang, Jun
    Cichocki, Andrzej
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023, 34 (08) : 4881 - 4891
  • [27] Item feature refinement using matrix factorization and boosted learning based user profile generation for content-based recommender systems
    Pujahari, Abinash
    Sisodia, Dilip Singh
    EXPERT SYSTEMS WITH APPLICATIONS, 2022, 206
  • [29] Predicting User Behavior in Display Advertising via Dynamic Collective Matrix Factorization
    Li, Sheng
    Kawale, Jaya
    Fu, Yun
    SIGIR 2015: PROCEEDINGS OF THE 38TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2015, : 875 - 878
  • [30] Energy disaggregation based on smart metering data via semi-binary nonnegative matrix factorization
    Miyasawa, Ayumu
    Fujimoto, Yu
    Hayashi, Yasuhiro
    ENERGY AND BUILDINGS, 2019, 183 : 547 - 558