Efficient integration of generative topic models into discriminative classifiers using robust probabilistic kernels

Cited by: 4
Authors
Ihou, Koffi Eddy [1 ]
Bouguila, Nizar [1 ]
Bouachir, Wassim [2 ]
Affiliations
[1] Concordia Univ, Concordia Inst Informat Syst Engn, Montreal, PQ H3G 1M8, Canada
[2] TELUQ Univ, Dept Sci & Technol, Montreal, PQ H2S 3L5, Canada
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
Hybrid (generative-discriminative) models; Support vector machine; Conjugate priors; Beta-Liouville; Generalized Dirichlet; Probabilistic kernels; Document classification; CLASSIFICATION; SPACE;
DOI
10.1007/s10044-020-00917-1
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We propose an alternative to the generative classifier, which usually models the class conditionals and class priors separately and then uses Bayes' theorem to compute the posterior distribution of classes given the training set as a decision boundary. Because the support vector machine (SVM) is not a probabilistic framework, it is difficult to implement a direct posterior-distribution-based discriminative classifier. Since the SVM lacks a full Bayesian analysis, we propose a hybrid (generative-discriminative) technique in which generative topic features obtained through Bayesian learning are fed to the SVM. The standard latent Dirichlet allocation topic model, with its Dirichlet (Dir) priors, can be characterized as a Dir-Dir topic model, reflecting the Dirichlets placed on the document and corpus parameters. Using very flexible conjugate priors to the multinomials, namely the generalized Dirichlet (GD) and the Beta-Liouville (BL), our proposed approach defines two new topic models: BL-GD and GD-BL. We take advantage of the geometric interpretation of our generative topic (latent) models, which associate a K-dimensional manifold (where K is the number of topics) embedded in a V-dimensional feature space (the word simplex), where V is the vocabulary size. Under this structure, the low-dimensional topic simplex (the subspace) represents each document as a single point on its manifold and associates each document with a single probability. The SVM, through its kernel trick, operates on these document probabilities for classification, using maximum-margin learning as the decision boundary. The key observation is that points, or documents, that are close to each other on the manifold should belong to the same class. Experimental results on text documents and images show the merits of the proposed framework.
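A minimal Python sketch of the hybrid pipeline the abstract describes, with two stated assumptions: scikit-learn's standard LDA stands in for the paper's BL-GD/GD-BL topic models, and the Bhattacharyya coefficient between topic distributions serves as one example of a probabilistic kernel (the paper's specific kernels may differ). The tiny two-class corpus is hypothetical, for illustration only.

import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.svm import SVC

# Hypothetical toy corpus with two classes (sports vs. politics).
docs = [
    "the team won the match with a late goal",
    "the striker scored twice in the final game",
    "the court ruled on the new tax legislation",
    "parliament passed the budget bill after debate",
]
labels = np.array([0, 0, 1, 1])

# Generative step: infer K-dimensional topic proportions per document.
# Each row of theta is a point on the K-simplex (the low-dimensional manifold).
counts = CountVectorizer().fit_transform(docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0)
theta = lda.fit_transform(counts)

# Probabilistic kernel: Bhattacharyya coefficient K(p, q) = sum_k sqrt(p_k q_k),
# an inner product of sqrt-embedded distributions, hence positive semidefinite.
def bhattacharyya_kernel(A, B):
    return np.sqrt(A) @ np.sqrt(B).T

# Discriminative step: maximum-margin classifier on the topic-simplex features.
svm = SVC(kernel=bhattacharyya_kernel)
svm.fit(theta, labels)
print(svm.predict(theta))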
Pages: 217-241
Page count: 25