Online multi-label dependency topic models for text classification

Cited by: 45
Authors
Burkhardt, Sophie [1 ]
Kramer, Stefan [1 ]
Affiliations
[1] Johannes Gutenberg Univ Mainz, Inst Comp Sci, Staudingerweg 9, D-55128 Mainz, Germany
Keywords
Multi-label classification; Online learning; LDA; Topic model; Algorithm
DOI
10.1007/s10994-017-5689-6
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Multi-label text classification is an increasingly important field, as large amounts of text data are available and extracting relevant information from them matters in many application contexts. Probabilistic generative models are the basis of a number of popular text mining methods such as Naive Bayes or Latent Dirichlet Allocation. However, Bayesian models for multi-label text classification are often overly complicated because they must account for label dependencies and skewed label frequencies while at the same time preventing overfitting. To solve this problem, we employ the same technique that contributed to the success of deep learning in recent years: greedy layer-wise training. Applying this technique in the supervised setting prevents overfitting and leads to better classification accuracy. The intuition behind this approach is to learn the labels first and subsequently add a more abstract layer to represent dependencies among the labels. This allows the use of a relatively simple hierarchical topic model which can easily be adapted to the online setting. We show that our method successfully models dependencies online for large-scale multi-label datasets with many labels and improves over a baseline method that does not model dependencies. The same strategy, layer-wise greedy training, also makes the batch variant competitive with existing, more complex multi-label topic models.
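To make the layer-wise scheme concrete, here is a minimal sketch of the two-stage idea outlined in the abstract, under the following assumptions: layer 1 assigns each label its own word distribution (in the spirit of Labeled LDA), and layer 2 is trained afterwards, with layer 1 fixed, as an LDA over label sets so that each "dependency topic" is a distribution over labels. The helper names and the use of scikit-learn's LatentDirichletAllocation are illustrative assumptions, not the authors' implementation.

```python
# Sketch of greedy layer-wise training for a label-dependency topic model.
# Assumptions (not the paper's code): layer 1 is a smoothed Labeled-LDA-style
# count model; layer 2 is scikit-learn LDA fit on the binary label matrix.
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation

def train_label_layer(X, Y, beta=0.01):
    """Layer 1: smoothed per-label word distributions.

    X: (n_docs, n_words) term counts; Y: (n_docs, n_labels) binary labels.
    Crediting each document's word counts to all of its labels is a
    simplification of the collapsed Gibbs assignments used in Labeled LDA.
    """
    counts = Y.T @ X + beta                       # (n_labels, n_words)
    return counts / counts.sum(axis=1, keepdims=True)

def train_dependency_layer(Y, n_dep_topics):
    """Layer 2, trained greedily after layer 1 is fixed: LDA over label
    sets, treating each document's labels as the 'words' of a short doc."""
    lda = LatentDirichletAllocation(n_components=n_dep_topics, random_state=0)
    lda.fit(Y)
    phi = lda.components_ / lda.components_.sum(axis=1, keepdims=True)
    return lda, phi                               # (n_dep_topics, n_labels)

# Toy usage: 4 documents, 5 word types, 3 labels.
X = np.array([[2, 1, 0, 0, 0], [0, 2, 1, 0, 0],
              [0, 0, 1, 2, 0], [0, 0, 0, 1, 2]])
Y = np.array([[1, 1, 0], [1, 0, 0], [0, 1, 1], [0, 0, 1]])
label_word = train_label_layer(X, Y)                        # (3, 5)
_, dep_topics = train_dependency_layer(Y, n_dep_topics=2)   # (2, 3)
print(label_word.round(2))
print(dep_topics.round(2))
```

The greedy aspect is that the second layer never revisits the first: layer 1's label-word distributions are fixed before the dependency topics over labels are estimated, which is what keeps the hierarchical model simple enough to adapt to the online setting.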
Pages: 859-886
Number of pages: 28